Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Pydanitc models introduced for AIP-44 #44552

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Dec 1, 2024

The Pudanic models have been used in a number of places still and we are using them also for context passing for PythonVirtualEnv and ExternaPythonOperator - this PR removes all the models and their usages.

Closes: #44436


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member Author

potiuk commented Dec 1, 2024

It's not likely this one will be green, but it's likely it will show which parts have to be fixed because they used implicitly Pydantic Models synchronisation.

But generally speaking it is ready to take a first pass of review.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 5 out of 12 changed files in this pull request and generated no suggestions.

Files not reviewed (7)
  • airflow/serialization/pydantic/init.py: Evaluated as low risk
  • airflow/serialization/pydantic/asset.py: Evaluated as low risk
  • airflow/serialization/pydantic/dag.py: Evaluated as low risk
  • airflow/serialization/pydantic/dag_run.py: Evaluated as low risk
  • airflow/serialization/pydantic/job.py: Evaluated as low risk
  • airflow/serialization/pydantic/taskinstance.py: Evaluated as low risk
  • airflow/serialization/pydantic/tasklog.py: Evaluated as low risk
Comments skipped due to low confidence (1)

airflow/models/taskinstance.py:1

  • The function _get_previous_dagrun_success is missing the task_instance and session parameters. It should be defined as def _get_previous_dagrun_success(task_instance, session) -> DagRun | None:.
def _get_previous_dagrun_success() -> DagRun | None:
@potiuk potiuk force-pushed the remove-aip-44-pydantic-models branch from 0470905 to 3919b89 Compare December 2, 2024 00:00
@potiuk
Copy link
Member Author

potiuk commented Dec 2, 2024

BTW. This is the last one in the #44436 series !

Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small leftover with DagTag - but otherwise looking good!

Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good.

I will not say that I'm glad to see all that gone, but I kind of am :p. Working with DagRun | DagRunPydantic | None and similars wasn't super easy

@dstandish
Copy link
Contributor

happy we're removing use_pydantic_models

@potiuk
Copy link
Member Author

potiuk commented Dec 3, 2024

happy we're removing use_pydantic_models

Yeah ... I had no time to look at the failures today ... but tomorrow 🤞 is the day :). I just feel that there will be some left-overs with the context and some TODOs to fix - it's worth to do it anyway :)

@potiuk potiuk force-pushed the remove-aip-44-pydantic-models branch 2 times, most recently from bcbbb4f to a685255 Compare December 7, 2024 15:25
@potiuk
Copy link
Member Author

potiuk commented Dec 8, 2024

Speaking of which. I think we have to agree on Pydantic strategy here.

There are two reasons why this one is "red"

  • as discussed before - we are passing Context in PythonVirtualenv operator to virtualenv when use_airflow_context=True - which is likely not good idea for Standard provider.

So far it was really a "future" feature - only working with Airflow 3 and in Airflow 2 it was only working with AIP-44 enabled (which was never a production option, except a few crazy people (cc: @jscheffl ;). So we could possibly even drop it. Or we can likely serialize it using the same mechanism serialization will happen for TaskSDK (@kaxil @ashb)? What would it be? I looked at the datamodels in task_sdk and have not seen context autogenerated yet (did I look in the right place?) maybe it's somewhere else.
So far context was serializable with assumption that AIP-44 is enabled and the "ORM" objects were serialized to their Pydantic counterparts - the whole contex serialization was added here #38567 by @dstandish and it assumed all the ORM models that were part of the context were serializable (this will stop being true once we remove the `*Pydantic classes' )

What plans do you have for for context serialization ?

  • also current Fast API implementation uses DagTagPydantic as serialization layer. I am not sure if we can do it differently @pierrejeambrun but that sounds like there should be an easy way to do so?

@potiuk potiuk force-pushed the remove-aip-44-pydantic-models branch from a685255 to a93e3ad Compare December 8, 2024 17:46
@pierrejeambrun
Copy link
Member

pierrejeambrun commented Dec 10, 2024

also current Fast API implementation uses DagTagPydantic as serialization layer. I am not sure if we can do it differently @pierrejeambrun but that sounds like there should be an easy way to do so?

We just need to copy into the api_fastapi/core_api/datamodels the DagTagPydantic. (and rename it into DagTagResponse, those are specific RestAPI serializers). The API has it own serializers, but this one was common so I re-used it, but I'm not sure it was even a good idea in the first place. (Now that all those are going away there's no hesitation anymore 😀)

@potiuk
Copy link
Member Author

potiuk commented Dec 10, 2024

So from those two responses:

We haven't tackled context yet, but these models can be removed and we'll add back the ones we need in the execution SDK (some will be very different, some similar) so merge when ever you are happy with this

We just need to copy into the api_fastapi/core_api/datamodels the DagTagPydantic. (and rename it into DagTagResponse, those are specific RestAPI serializers). The API has it own serializers, but this one was common so I re-used it, but I'm not sure it was even a good idea in the first place. (Now that all those are going away there's no hesitation anymore 😀)

Thanks for those answers!

Let me try to complete this one then. I will leave the "use_airflow_context" as a feature of Python Virtualenv that only (will) work in Airflow 3 and leave to DO to add context propagation once we have it done for AIP-72.

And we should be able to close this one.

@potiuk potiuk force-pushed the remove-aip-44-pydantic-models branch from a93e3ad to 5c8c85c Compare December 12, 2024 15:28
Copy link
Member

@pierrejeambrun pierrejeambrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice thanks.

@potiuk potiuk added the full tests needed We need to run full set of tests for this PR to merge label Dec 12, 2024
@potiuk potiuk closed this Dec 12, 2024
@potiuk potiuk reopened this Dec 12, 2024
@potiuk
Copy link
Member Author

potiuk commented Dec 12, 2024

Reopen with full-tests-needed

@potiuk potiuk force-pushed the remove-aip-44-pydantic-models branch 2 times, most recently from 39960b3 to b0dea9b Compare December 12, 2024 17:02
The Pudanic models have been used in a number of places still and
we are using them also for context passing for PythonVirtualEnv
and ExternaPythonOperator  - this PR removes all the models and
their usages.

Closes: apache#44436

 # Please enter the commit message for your changes. Lines starting
@potiuk
Copy link
Member Author

potiuk commented Dec 12, 2024

Ok. Looks like Pydantic classes removal is ready to merge - any last pass? I moved the DagTag to fast_api models, and added error messages/conditional tests/ left TODOs to bring use_airflow_context when we have serializable context for AIP-72.

If anyone want to have last pass -> feel free.

Copy link
Contributor

@jscheffl jscheffl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Boom. Let is dissolve :-D

Copy link
Member

@gopidesupavan gopidesupavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally :)

@potiuk potiuk merged commit 8af1bbd into apache:main Dec 12, 2024
97 checks passed
@potiuk potiuk deleted the remove-aip-44-pydantic-models branch December 12, 2024 21:36
ellisms pushed a commit to ellisms/airflow that referenced this pull request Dec 13, 2024
The Pudanic models have been used in a number of places still and
we are using them also for context passing for PythonVirtualEnv
and ExternaPythonOperator  - this PR removes all the models and
their usages.

Closes: apache#44436

 # Please enter the commit message for your changes. Lines starting
LefterisXefteris pushed a commit to LefterisXefteris/airflow that referenced this pull request Jan 5, 2025
The Pudanic models have been used in a number of places still and
we are using them also for context passing for PythonVirtualEnv
and ExternaPythonOperator  - this PR removes all the models and
their usages.

Closes: apache#44436

 # Please enter the commit message for your changes. Lines starting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:serialization full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Removal of AIP-44 code
8 participants