Retry for TaskGroup #21333
Replies: 6 comments 12 replies
-
This is a discussion, not an issue - please open a "GitHub Discussion" next time. I hope someone with experience operating Airflow can help you with answers.
-
Soo... no ideas? Am I doing it right?
-
+1, this feature would be very useful.
-
Why not use |
-
@victorfuzaro @sartyukhov I'm wondering how do you expect to
with TaskGroup(..., retries=4) as tg:
    A = DummyOperator(task_id="A", retries=3)
    B = DummyOperator(task_id="B")
    A >> B
The simplest scenario where a) Should we allow 3 retries of Probably a) sounds reasonable but it may affect db state of TaskInstances because we will end up with
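To put a rough number on the nested case, here is a minimal sketch. It assumes one possible reading of option a): a task's own retries are exhausted first, and only then is the whole group retried. The comment above is cut off, so that reading is an assumption, and `worst_case_attempts` is a hypothetical helper, not an Airflow API.

```python
def worst_case_attempts(task_retries: int, group_retries: int) -> int:
    """Worst-case number of executions of a single task that always fails.

    Assumption: within one run of the group the task is tried
    (task_retries + 1) times, and the group itself is run
    (group_retries + 1) times.
    """
    attempts_per_group_run = task_retries + 1
    group_runs = group_retries + 1
    return attempts_per_group_run * group_runs


# Task A from the snippet: retries=3 inside a group with retries=4.
print(worst_case_attempts(task_retries=3, group_retries=4))  # 20
```

Under that assumption a single failing task could be attempted up to 20 times, so the number of tries recorded per task would no longer be bounded by the task's own `retries` value.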
-
Any news on this? Would really love to see this. I have a peculiar use case for repeating entire TaskGroups.
I run a lot of scraping pipelines. Those are typically followed by parsing and translation steps, which fit the retrieved data into our own internal schema. Occasionally, the data itself is corrupted because the service is down intermittently. As a result, all downstream tasks should fail. However, we do not validate the data at the scraping step, because we essentially collect all the content into a WARC (internet archive) file so that we can validate the data downstream.
In cases like these, the retry should re-run the scraping as well, even though technically the scraping task was deemed a
There are two options here:
Option 2 is what I believe is being described above, and is the preferred option.
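As a minimal sketch of the behavior asked for here (plain Python, not Airflow code): when the downstream parse step rejects the data, the whole group starts over, so the scraping step runs again even though it did not fail itself. All names (`run_pipeline_with_group_retry`, `CorruptedData`, the fake responses) are hypothetical and only for illustration.

```python
import json
from typing import Callable


class CorruptedData(Exception):
    """Raised by the downstream step when the scraped payload is unusable."""


def run_pipeline_with_group_retry(
    scrape: Callable[[], str],
    parse: Callable[[str], dict],
    retries: int,
) -> dict:
    """Run scrape >> parse; if parse fails, re-run both steps."""
    last_error = None
    for _ in range(retries + 1):
        raw = scrape()  # "succeeds" even when the payload is corrupted
        try:
            return parse(raw)
        except CorruptedData as exc:
            last_error = exc  # downstream failed: the loop re-runs scrape too
    raise RuntimeError(f"group failed after {retries + 1} attempts") from last_error


# Hypothetical flaky service: the first response is corrupted, the second is fine.
responses = iter(["<html>503</html>", '{"title": "ok"}'])


def scrape() -> str:
    return next(responses)


def parse(raw: str) -> dict:
    if not raw.startswith("{"):
        raise CorruptedData(raw)
    return json.loads(raw)


result = run_pipeline_with_group_retry(scrape, parse, retries=2)
print(result)  # {'title': 'ok'}
```

A task-level retry on the parse step alone would keep re-reading the same corrupted payload, which is why the retry has to cover the group.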
-
Description
Hello!
Previously, a SubDag was used to organize tasks into groups. Now you've introduced TaskGroups to the world.
It's nice and very clever. But it has one big disadvantage compared to SubDag: it can't be repeated (retried) as a whole.
Use case/motivation
For example:
In a project I have two tasks (A >> B):
A - collect data (PythonOperator)
B - refresh a materialized view in Postgres (PostgresOperator)
'A' may collect only part of the data and mark itself as failed (there is no "half-failed" status as far as I know). But task 'B' should run regardless of A's result (trigger_rule="all_done", for example) to refresh the matview with the partial data.
About an hour later I would like to repeat the whole process (A >> B).
With SubDag I could do that:
and that's it: C marks the DAG as failed, which triggers it to retry.
But TaskGroup does not have a retry parameter.
I also can't retry the whole DAG, because it's big.
I also don't want to refresh the materialized view inside task 'A', because then I can't do [A0, A1..An] >> B (refresh the materialized view just once for several collects).
I hope it's possible. Or maybe it could be done some other way.
Thanks in advance.
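For illustration, the requested semantics can be simulated in plain Python. This is a minimal sketch, not Airflow code, and it assumes a simple linear chain of tasks. B uses an "all_done"-style rule so it runs even when A failed, and the group as a whole is re-run if any task failed. The helpers `run_group_once` and `run_group_with_retries` are hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

# (task_id, callable, trigger_rule); only "all_success" and "all_done" are modelled.
Task = Tuple[str, Callable[[], None], str]


def run_group_once(tasks: List[Task]) -> Dict[str, str]:
    """Run the tasks in order and return {task_id: state}."""
    states: Dict[str, str] = {}
    upstream_failed = False
    for task_id, fn, trigger_rule in tasks:
        if upstream_failed and trigger_rule != "all_done":
            states[task_id] = "upstream_failed"
            continue
        try:
            fn()
            states[task_id] = "success"
        except Exception:
            states[task_id] = "failed"
            upstream_failed = True
    return states


def run_group_with_retries(
    tasks: List[Task], retries: int, retry_delay: float = 0.0
) -> List[Dict[str, str]]:
    """Re-run the whole group until every task succeeds or retries run out."""
    history = []
    for attempt in range(retries + 1):
        states = run_group_once(tasks)
        history.append(states)
        if all(state == "success" for state in states.values()):
            break
        if attempt < retries:
            time.sleep(retry_delay)  # e.g. 3600 for "about an hour later"
    return history


a_calls, b_calls = [], []


def collect_data() -> None:
    a_calls.append(1)
    if len(a_calls) == 1:  # the first run collects only part of the data
        raise RuntimeError("only part of the data was collected")


def refresh_matview() -> None:
    b_calls.append(1)


history = run_group_with_retries(
    [("A", collect_data, "all_success"), ("B", refresh_matview, "all_done")],
    retries=1,
)
print(history)
# [{'A': 'failed', 'B': 'success'}, {'A': 'success', 'B': 'success'}]
```

In the first run B still refreshes the view with the partial data; the retry then repeats both A and B, which is what a single task-level retry cannot express.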
Related issues
No response