Does it really matter if you import numpy when it will be in sys.modules after the first parse? #33737

Answered by potiuk
karlkovaciny asked this question in Q&A

Every DAG in Airflow is parsed in a separately forked process. This achieves three things: a) isolation; b) classes are reloaded every time an import is made, so we always load the latest version of the imported files; c) the whole DAG file processor does not crash if, for whatever reason, the import fails (even with errors like SIGSEGV): only the forked process gets killed, and the main process continues forking processes to parse DAG files.

This means that whatever is imported in a DAG parsed by the DAG file processor is cached only in that forked process and is discarded once parsing of that individual file completes (the processes exit after the DAG is serialized to JSON form and saved t…
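To make the point about `sys.modules` concrete, here is a minimal sketch. It is not Airflow's DAG file processor code: it uses plain `multiprocessing` with the `fork` start method (so POSIX only), and `colorsys` stands in for a heavy import like numpy. The function names are made up for illustration.

```python
import importlib
import multiprocessing
import sys

# Stand-in for a heavy dependency such as numpy (assumption for the demo).
MODULE_NAME = "colorsys"


def _parse_in_child(module_name, conn):
    """Simulate parsing one DAG file: import the module, report, then exit."""
    cached_before = module_name in sys.modules
    importlib.import_module(module_name)
    cached_after = module_name in sys.modules
    conn.send((cached_before, cached_after))
    conn.close()


def import_in_forked_child(module_name):
    """Fork a child that imports module_name; return (cached_before, cached_after)."""
    ctx = multiprocessing.get_context("fork")  # not available on Windows
    recv_conn, send_conn = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_parse_in_child, args=(module_name, send_conn))
    proc.start()
    send_conn.close()  # the parent only reads from the pipe
    result = recv_conn.recv()
    proc.join()
    return result


def demo(module_name=MODULE_NAME):
    # Make sure the parent starts without the module, like a fresh main process.
    sys.modules.pop(module_name, None)
    first_parse = import_in_forked_child(module_name)
    second_parse = import_in_forked_child(module_name)
    return first_parse, second_parse, module_name in sys.modules


if __name__ == "__main__":
    print(demo())
```

In this sketch each child reports that the module was not cached before its own import, and the parent never sees the module in its `sys.modules`, so the second "parse" gets no benefit from the first one.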

Replies: 1 comment 4 replies

potiuk Aug 25, 2023
Collaborator

potiuk Aug 26, 2023
Collaborator

Answer selected by karlkovaciny
Category
Q&A
Labels
None yet
2 participants