Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update conf column in dag_run table type from bytes to JSON #44533

Merged

Conversation

vatsrahul1001
Copy link
Collaborator

closes: #43933


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@vatsrahul1001 vatsrahul1001 marked this pull request as draft December 1, 2024 13:11
@vatsrahul1001 vatsrahul1001 marked this pull request as ready for review December 2, 2024 10:32
@vatsrahul1001 vatsrahul1001 marked this pull request as draft December 2, 2024 10:33
@vatsrahul1001
Copy link
Collaborator Author

While Testing this I noticed an issue in the downgrade case. We are not removing the data from downgrade and just perform below conversion.

          ALTER TABLE dag_run
          ALTER COLUMN conf TYPE BYTEA
          USING CASE
              WHEN conf IS NOT NULL THEN CONVERT_TO(conf::TEXT, 'UTF8')
              ELSE NULL
          END
 

I am getting UnpicklingError error with this when I try to open DAG page.
image

Should we also move data to the archive table in case of a downgrade as well?
cc: @kaxil

@vatsrahul1001 vatsrahul1001 marked this pull request as ready for review December 2, 2024 13:23
@kaxil
Copy link
Member

kaxil commented Dec 2, 2024

While Testing this I noticed an issue in the downgrade case. We are not removing the data from downgrade and just perform below conversion.

          ALTER TABLE dag_run
          ALTER COLUMN conf TYPE BYTEA
          USING CASE
              WHEN conf IS NOT NULL THEN CONVERT_TO(conf::TEXT, 'UTF8')
              ELSE NULL
          END
 

I am getting UnpicklingError error with this when I try to open DAG page. image

Should we also move data to the archive table in case of a downgrade as well? cc: @kaxil

Yeah, it is different than XCom, because XCom code had handling of both JSON & pickle type -- for dagrun conf it would be different. For downgrade you might just want to insert all records from the archive table back here.

@vatsrahul1001 vatsrahul1001 added the airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes label Dec 3, 2024
kaxil pushed a commit that referenced this pull request Dec 3, 2024
In [PR](#44166) we added migration for removing pickled data from `xcom` table. During my testing I noticed with `SQLite`  [insert](https://github.com/apache/airflow/blob/main/airflow/migrations/versions/0049_3_0_0_remove_pickled_data_from_xcom_table.py#L88) statement is not working in case of upgrade. Changing condition to `hex(substr(value, 1, 1)) = '80'` works. Tested [here](#44533 (comment)).

related: #44166
LefterisXefteris pushed a commit to LefterisXefteris/airflow that referenced this pull request Jan 5, 2025
In [PR](apache#44166) we added migration for removing pickled data from `xcom` table. During my testing I noticed with `SQLite`  [insert](https://github.com/apache/airflow/blob/main/airflow/migrations/versions/0049_3_0_0_remove_pickled_data_from_xcom_table.py#L88) statement is not working in case of upgrade. Changing condition to `hex(substr(value, 1, 1)) = '80'` works. Tested [here](apache#44533 (comment)).

related: apache#44166
@ephraimbuddy ephraimbuddy dismissed their stale review January 7, 2025 12:35

To allow offline migration to work would require creating an extension in postgres that converst pickled data to json which I think would be a lot to do in this PR. I would probably look at this another time but it shouldn't block this PR

@ephraimbuddy ephraimbuddy merged commit db132fb into apache:main Jan 13, 2025
94 of 95 checks passed
@ephraimbuddy ephraimbuddy deleted the dag_run_migrate_conf_column_as_json branch January 13, 2025 14:01
karenbraganz pushed a commit to karenbraganz/airflow that referenced this pull request Jan 13, 2025
…4533)

* remove pickled data from dag run table

* fix downgrade + add news fragement

* remove archive table if exits after downgrade

* removing archiving data

* fixing static check

* fixing static checks

* simplying upgrade and downgrade as per review

* fixing failures

* removing setting conf to null

* refactor approach to migrate values in conf

* update offline warning

* resolving conflicts

* resolving conflicts

* resolving conflicts

* updating batch size

* updaing conf type

---------

Co-authored-by: Jed Cunningham <[email protected]>
HariGS-DB pushed a commit to HariGS-DB/airflow that referenced this pull request Jan 16, 2025
…4533)

* remove pickled data from dag run table

* fix downgrade + add news fragement

* remove archive table if exits after downgrade

* removing archiving data

* fixing static check

* fixing static checks

* simplying upgrade and downgrade as per review

* fixing failures

* removing setting conf to null

* refactor approach to migrate values in conf

* update offline warning

* resolving conflicts

* resolving conflicts

* resolving conflicts

* updating batch size

* updaing conf type

---------

Co-authored-by: Jed Cunningham <[email protected]>
dauinh pushed a commit to dauinh/airflow that referenced this pull request Jan 24, 2025
…4533)

* remove pickled data from dag run table

* fix downgrade + add news fragement

* remove archive table if exits after downgrade

* removing archiving data

* fixing static check

* fixing static checks

* simplying upgrade and downgrade as per review

* fixing failures

* removing setting conf to null

* refactor approach to migrate values in conf

* update offline warning

* resolving conflicts

* resolving conflicts

* resolving conflicts

* updating batch size

* updaing conf type

---------

Co-authored-by: Jed Cunningham <[email protected]>
got686-yandex pushed a commit to got686-yandex/airflow that referenced this pull request Jan 30, 2025
In [PR](apache#44166) we added migration for removing pickled data from `xcom` table. During my testing I noticed with `SQLite`  [insert](https://github.com/apache/airflow/blob/main/airflow/migrations/versions/0049_3_0_0_remove_pickled_data_from_xcom_table.py#L88) statement is not working in case of upgrade. Changing condition to `hex(substr(value, 1, 1)) = '80'` works. Tested [here](apache#44533 (comment)).

related: apache#44166
got686-yandex pushed a commit to got686-yandex/airflow that referenced this pull request Jan 30, 2025
…4533)

* remove pickled data from dag run table

* fix downgrade + add news fragement

* remove archive table if exits after downgrade

* removing archiving data

* fixing static check

* fixing static checks

* simplying upgrade and downgrade as per review

* fixing failures

* removing setting conf to null

* refactor approach to migrate values in conf

* update offline warning

* resolving conflicts

* resolving conflicts

* resolving conflicts

* updating batch size

* updaing conf type

---------

Co-authored-by: Jed Cunningham <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:db-migrations PRs with DB migration full tests needed We need to run full set of tests for this PR to merge kind:documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dag_run table - replace column conf type - byte ( that store a python pickle ) by a JSON
7 participants