Unify executor workload queues #63491
Conversation
Force-pushed from 85bfce8 to 1ce0748
ferruzzi
left a comment
Made a real quick pass and left some comments and questions, I'll try to get a more thorough one tomorrow.
Force-pushed from aee94fb to 8997ee4
Force-pushed from 11ee7ef to 249b014
Force-pushed from 0f6f172 to 35a927f
Force-pushed from 1a0f949 to b6b161f
ferruzzi
left a comment
Looks like my concerns were all addressed. LGTM
Force-pushed from b6b161f to 228e71b
I'm taking a look at this now.
ashb
left a comment
Thanks for tackling this!
Please add a newsfragment for the deprecated public API.
I'm also worried that an un-updated executor (either one of "our" ones where the user hasn't updated yet, or a custom one) would spam the living daylights out of the logs by accessing the deprecated queued_tasks property on every heartbeat. That needs addressing so we only log once per class or once per instance, I think.
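One way to address the warn-once concern could look like the sketch below. It is illustrative only: `ExecutorSketch` is a stand-in for the real base executor, and `RemovedInAirflow4Warning` is redefined locally so the snippet is self-contained. A class-level set records which subclasses have already warned, so a misbehaving executor polling the property on every heartbeat triggers the warning only on the first access.

```python
import warnings


class RemovedInAirflow4Warning(DeprecationWarning):
    """Local stand-in for Airflow's deprecation warning class."""


class ExecutorSketch:
    # Tracks which (sub)classes have already emitted the deprecation warning.
    _warned_classes: set[type] = set()

    @property
    def queued_tasks(self) -> dict:
        cls = type(self)
        if cls not in ExecutorSketch._warned_classes:
            ExecutorSketch._warned_classes.add(cls)
            warnings.warn(
                f"{cls.__name__}.queued_tasks is deprecated; use executor_queues instead.",
                RemovedInAirflow4Warning,
                stacklevel=2,
            )
        # Real code would return the task sub-dict of executor_queues here.
        return {}
```

Keying the set on `type(self)` gives once-per-class behavior; an instance attribute flag would give once-per-instance instead.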
    def queued_tasks(self) -> dict[TaskInstanceKey, Any]:
        """Return queued tasks from celery and kubernetes executor."""
-       return self.celery_executor.queued_tasks | self.kubernetes_executor.queued_tasks  # type: ignore[return-value]
+       queued_tasks = self.celery_executor.queued_tasks.copy()
CeleryKubernetesExecutor.queued_tasks calls the deprecated BaseExecutor.queued_tasks property on both child executors, emitting RemovedInAirflow4Warning on every access. Since this file is already being updated in this PR, please migrate these call sites to use the new API (guarded with AIRFLOW_V_3_3_PLUS for back-compat with Airflow <3.3).
This executor raises RuntimeError on Airflow 3.0+ (line 80), so this code path is unreachable on any version where executor_queues exists. We shouldn't be changing these files any more than is strictly needed to keep CI happy.
        self.team_name: str | None = team_name
        self.queued_tasks: dict[TaskInstanceKey, workloads.ExecuteTask] = {}
        self.queued_callbacks: dict[str, workloads.ExecuteCallback] = {}
        self.executor_queues: dict[str, dict[WorkloadKey, QueueableWorkload]] = defaultdict(dict)
Could this be a flat dict[WorkloadKey, QueueableWorkload] instead of a dict-of-dicts?
_get_workloads_to_schedule immediately flattens all sub-dicts into a single list and then sorts by (WORKLOAD_TYPE_PRIORITY, sort_key) — so priority ordering (callbacks before tasks, higher-weight tasks first) is entirely in the sort step, not in the dict structure. A flat dict would produce identical scheduling behaviour.
The only load-bearing use of the sub-dict grouping is the deprecated queued_tasks/queued_callbacks compat properties — which are on their way out. Every type-keyed deletion in providers (del self.executor_queues[WorkloadType.EXECUTE_TASK][key]) could be a plain del flat_dict[key] since WorkloadKey is unique across types.
A flat dict would simplify CeleryKubernetesExecutor.queued_tasks (no sub-dict merging), make provider-side deletion uniform, and remove the defaultdict(dict) nesting.
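A toy model of the ordering argument above, assuming nothing about the real implementation: workloads are plain dicts, and `WORKLOAD_TYPE_PRIORITY` and `sort_key` are illustrative stand-ins for the actual fields. The point is that the scheduling order comes entirely from the sort key, not from the nesting, so nested and flat storage sort identically.

```python
# Illustrative priorities: callbacks are scheduled before tasks.
WORKLOAD_TYPE_PRIORITY = {"execute_callback": 0, "execute_task": 1}


def order_nested(executor_queues: dict) -> list:
    # Current shape: dict[workload_type, dict[key, workload]], flattened
    # before sorting, as _get_workloads_to_schedule is described to do.
    flat = [w for sub in executor_queues.values() for w in sub.values()]
    return sorted(flat, key=lambda w: (WORKLOAD_TYPE_PRIORITY[w["type"]], w["sort_key"]))


def order_flat(queue: dict) -> list:
    # Proposed shape: a single dict[key, workload]; same sort, same result.
    return sorted(queue.values(), key=lambda w: (WORKLOAD_TYPE_PRIORITY[w["type"]], w["sort_key"]))
```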
FWIW, the gate on line 80 of CeleryKubernetesExecutor throws a RuntimeError if Airflow version is >= 3.0, so I don't think "simplifying CeleryKubernetesExecutor" needs to factor into this decision.
That said, a flat dict isn't a bad idea in principle and likely would have been a cleaner choice in the first place. My main concern is mostly practical: Lambda (#63035), Celery (#63888), and Batch (#62984) have already merged using the executor_queues[WorkloadType.X][key] pattern, and ECS (#63657), K8s (#63454), and Edge (#63498) are all in progress implementing the same pattern. Flattening now would mean reworking all six executor implementations in addition to the changes that it would require in this PR. I'm not sure we really gain anything for that work?
How would you feel about adding a TODO to flatten it when the compat properties are removed (in 4.0??)? At that point the nested structure loses its main justification anyway. Does that seem reasonable, or is that just punting the same work to Future Us?
Force-pushed from 228e71b to 36bb7b9
Thank you so much @ashb and @ferruzzi for the review! Latest push:
Two threads I'd love your steer on:
Would like to request your re-review. Will follow whichever way you'd both recommend. Thanks!
Was generative AI tooling used to co-author this PR?
Summary
Refactors executor workload queue management for extensibility. No behavioral change: scheduling order, slot accounting, and all provider executors work identically to before.
Follows the direction proposed by @ferruzzi #62343 (comment).
Problem
Adding a new workload type (like ExecuteCallback or TestConnection) required touching ~6 places in BaseExecutor: a new queue dict, a new supports_* flag, slots calculation, an isinstance branch in queue_workload, a dedicated scheduling method, and isinstance branches in dequeue/trigger logic. Each provider executor that overrode queue_workload also needed updating. This made extending the executor interface unnecessarily painful.
What this does
Replaces the per-type queue dicts and boolean capability flags with three simple primitives:
The base class queue_workload is now generic: validate the type, store by key. Four provider executors (K8s, ECS, Batch, Lambda) no longer need their own queue_workload overrides. trigger_tasks becomes trigger_workloads since it handles all workload types now.
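A minimal sketch of the "validate the type, store by key" pattern described above. The class name, the set of supported types, and the dict-based workload model are all illustrative, not the actual Airflow API.

```python
from collections import defaultdict


class GenericQueueExecutor:
    # Hypothetical capability declaration replacing per-type boolean flags.
    supported_workload_types = {"execute_task", "execute_callback"}

    def __init__(self) -> None:
        # One nested mapping instead of one queue dict per workload type.
        self.executor_queues: dict[str, dict] = defaultdict(dict)

    def queue_workload(self, workload: dict) -> None:
        wtype = workload["type"]
        if wtype not in self.supported_workload_types:
            raise ValueError(f"{type(self).__name__} cannot run {wtype!r} workloads")
        # Generic path: no isinstance branches, no per-type queue attributes.
        self.executor_queues[wtype][workload["key"]] = workload
```

Under this shape, supporting a new workload type means adding it to `supported_workload_types` (and handling it at dequeue time), rather than threading a new queue dict through the base class.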
Adding a new workload type after this refactor
No changes needed in BaseExecutor itself.
Add a newsfragment named {pr_number}.significant.rst in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created, once you know the PR number.