Question - using smaller DAGs as pieces of larger DAGs #13953
Hi everyone, I'm using Airflow to orchestrate my ETLs, and I have a question on best practices. Basically I have an ETL composed of, let's say, 5 sequential SQL queries. I am going to be triggering this ETL independently every once in a while, but I also want to include it as a part of a larger ETL, in parallel with other similar sub-ETLs. I would like to be able to:
I came across the SubDag Airflow module, but the usage is pretty confusing to me (having to use a DAG factory, having to set the DAG Id to ".", etc.), and it seems to be recommended for repetitive tasks, and not to compose DAGs from smaller ones. Has anyone faced this conceptual problem before? Any tips would be much appreciated!
Replies: 1 comment 4 replies
I think you may be interested in TaskGroups: http://airflow.apache.org/docs/apache-airflow/stable/concepts.html#taskgroup
Also, I would consider creating, for each of those queries, a function that returns an operator instance. Then you can simply call those functions in any DAG and easily reuse them 🤔
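To make the factory idea a bit more concrete, here is a minimal sketch of one way it could look. Everything named here is hypothetical: `SALES_STEPS`, `build_sequential_tasks`, and the SQL strings are placeholders, and `FakeTask` is only a stand-in so the snippet runs without Airflow installed. The one Airflow behaviour it relies on is that `a >> b` sets `b` downstream of `a`.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")  # any task type that supports `a >> b`, as Airflow operators do

# Hypothetical sub-ETL: ordered (task_id, sql) pairs. Names and SQL are placeholders.
SALES_STEPS: List[Tuple[str, str]] = [
    ("stage_sales", "INSERT INTO staging_sales SELECT * FROM raw_sales"),
    ("clean_sales", "DELETE FROM staging_sales WHERE amount IS NULL"),
    ("load_sales", "INSERT INTO sales SELECT * FROM staging_sales"),
]


def build_sequential_tasks(
    steps: Sequence[Tuple[str, str]],
    make_task: Callable[[str, str], T],
) -> List[T]:
    """Create one task per step and chain them in the order given.

    `make_task` is supplied by the caller, so the same step list can be
    built inside a standalone DAG or inside a larger one.
    """
    tasks = [make_task(task_id, sql) for task_id, sql in steps]
    for upstream, downstream in zip(tasks, tasks[1:]):
        upstream >> downstream  # same dependency syntax Airflow operators use
    return tasks


class FakeTask:
    """Stand-in for an operator (an assumption, not an Airflow class)."""

    def __init__(self, task_id: str, sql: str) -> None:
        self.task_id = task_id
        self.sql = sql
        self.downstream_ids: List[str] = []

    def __rshift__(self, other: "FakeTask") -> "FakeTask":
        # Record the dependency and return the right-hand side, mirroring `>>` chaining.
        self.downstream_ids.append(other.task_id)
        return other


sales_tasks = build_sequential_tasks(SALES_STEPS, FakeTask)
```

In a real DAG file you would pass a small function that builds whichever SQL operator you use instead of `FakeTask`, and make the call inside a `with DAG(...)` block. Wrapping that call in `with TaskGroup(group_id="sales"):` should keep each sub-ETL collapsed into a single node in the larger DAG, while the standalone DAG calls the same function on its own.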
Also, if you are only using PythonOperator, you may be interested in the TaskFlow API:
http://airflow.apache.org/docs/apache-airflow/stable/concepts.html#taskflow-api