Do not deserialize trigger_kwargs when loading serialized DAGs#66002
Conversation
|
Just one test left! |
|
The changes to the
The raw JSON has Encoding enum instances as dict keys — that's how Python 3.10+ Python 3.10: On Python ≤3.10, str(Encoding.TYPE) returns "Encoding.TYPE" (the enum repr) instead of "__type" (its value), mangling the keys so the Triggerer cannot read them back. Adding |
Caused some side effects, writing a manual normalising code in |
|
@potiuk some things changed since you reviewed, do you wanna take another look? |
potiuk
left a comment
There was a problem hiding this comment.
CI is currently red on the Python 3.10 Core...Serialization jobs (Sqlite / MySQL / Postgres / LowestDeps). The failures are not just the test files modified by this PR — airflow-core/tests/unit/triggers/test_base_trigger.py::test_render_template_fields and test_render_template_fields_filters_to_trigger_kwargs are also failing with assert 'name' in () and a ValueError('not enough values to unpack (expected 2, got 1)') raised in repr(). Those tests aren't touched by this PR, which means the new shape of trigger_kwargs is leaking past defer_task into other consumers.
The premise of the change is good (skipping the deserialize-then-reserialize cycle is a real win), but the _normalize workaround in defer_task is too narrow — it only protects one path while leaving every other consumer of start_trigger_args.trigger_kwargs to discover the encoded form on its own. Two suggestions:
- Move the normalization to the
serdeboundary, e.g. insideTrigger.encrypt_kwargs, so every consumer benefits and we don't need a workaround indefer_taskat all. - Or revisit the originally-attempted
Encoding.__str__fix — fixingstr(Encoding.TYPE) == "__type"on 3.10 would make this whole class of issues go away. The thread mentioned "side effects"; would be useful to see them written down so we can decide whether to address them directly.
Drafted-by: Claude Opus 4.7; reviewed by @potiuk before posting
|
Ok, I am taking a look at this one now |
|
@dabla would appreciate your review here too |
jason810496
left a comment
There was a problem hiding this comment.
Nice! Thanks for the fix, and it LGTM overall.
Wondering do you have any idea regarding avoid the similar error like this?
It's similar to last 3.2.0 release blocker that cause by some of the nested field it not serialize well for BaseEventTrigger
Perhaps checking if there's any field not serialize properly (e.g. regex with <XXX object at 0x0000...> format) with different case of triggers? (I don't have any concrete thought now, just thinking about how to catch this kind of error in the first place)
|
@jason810496 Good catch on that gap, it seems that the hash test only catches asymmetric breakage, not the case where both sides serialize equally wrong. Tracking adding a |
Yeah, I tracked in #66413 for the follow-up, anyone from the community can give a shot. |
|
@ephraimbuddy can you take another look? |
03a2817 to
a8eb624
Compare
Was generative AI tooling used to co-author this PR?
After #55068 was merged, inconsequentially,
_decode_start_trigger_argswas deserializingtrigger_kwargsandnext_kwargswhen loading a serialized DAG. These fields hold the serialized trigger state that only the Triggerer needs — when it picks up a deferred task, inflates the kwargs, and instantiates the trigger class.The Scheduler and API Server load serialized DAGs but never touch these values; deserializing them there is wasted work.
This PR removes the deserialization and keeps
trigger_kwargsandnext_kwargsas raw JSON onStartTriggerArgs. The Triggerer reads trigger kwargs directly from the Trigger DB row (not through DAG deserialization), so its path is unaffected:airflow-core/src/airflow/models/trigger.py#L170-L179What changes?
Why the changes to enum.py
#66002 (comment)
Scheduler
Before:
After:
The
Trigger.__init__callsserde.serialize()on whatever it receives, so deserializing then re-serializing was pure waste.API Server
Before: loads serialized DAG →
_decode_start_trigger_args()which deserializestrigger_kwargsinto Python objects → never used, thrown away.After: same but stays raw JSON. No functional difference, just skips the work.
Triggerer
Before and after:
The triggerer also loads the serialized DAG at https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/jobs/triggerer_job_runner.py#L717-L735 to check start_from_trigger. So it does go through
_decode_start_trigger_args()too, but it does not use trigger_kwargs from that — it readstrigger.encrypted_kwargsfrom the DB row directly below. So the fix is safe for the triggerer as well.{pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.