Airflow 3.0 deprecations
Opened this issue Β· 15 comments
Tracking issue for the AIR3xx
rules that implements Airflow 3.0 deprecations.
For details, refer to the Airflow tracking issue: apache/airflow#44556
Help wanted
The following is a "help wanted" task list:
- Add infrastructure to provide an auto-fix for the migration rules. Look at NumPy 2.0 implementation for reference
- Add a guard specifically for Airflow imports and attributes where a
try .. except
block could include both the new and deprecated symbols for backwards compatibility. Look at NumPy 2.0 implementation for reference
Ref: apache/airflow#41641
There's already a short proposal #14616 (comment). Putting it here might make it easier to find π
I'm thinking of separating "removal" and "rename" at least. It would be much easier for the users to know whether there's anything they could do to fix it.
Additionally, the NumPy 2.0 migration rule is a good reference whose implementation is at: https://github.com/astral-sh/ruff/blob/6f1cf5b686e67fb74383641fcbbe98e33481a271/crates/ruff_linter/src/rules/numpy/rules/numpy_2_0_deprecation.rs
Hi, we have grouped our deprecations into several categories and would appreciate it if the team could take a look and let us know what you think about it π
- already merged
- AIR301: Avoid implicit DAG schedule #14581
- newly proposed error number
- AIR302: removal or no direct mapping
- following #14582
- AIR303: rename
- It would be much easier for the users to know whether they could do anything to fix it.
- AIR304: moved to airflow providers
- for these, they're similar to "rename" but require the user to install an additional package (e.g.,
apache-airflow-providers-fab
)
- for these, they're similar to "rename" but require the user to install an additional package (e.g.,
- AIR302: removal or no direct mapping
Here's the categorised list we have now
apache/airflow#41641 (comment)
(there's another group we temporarily mark as AIR305, but we'll need some time to figure out what we can do for those. Will likely to be merged into 302 or 303)
I like what's proposed here. Thanks for going through all the deprecations and grouping them by possible rules. We can also do some fine-tuning as we go.
Later discuss with @uranusjr, we think it would be better for us to merge rename and removal into 302
. So the current idea would be
- already merged
- new rule number
- AIR303: moved to airflow providers
- for these, they're similar to "rename" but require the user to install an additional package (e.g., apache-airflow-providers-fab)
- AIR303: moved to airflow providers
We now use apache/airflow#44556 to track the detail of this task instead of the original one as some of the migration might not be able to be done in ruff
Sorry for pinging you here, but could someone more familiar with Airflow volunteer looking at the changes in #14631? We just want to make sure that the change is correct.
Sorry for pinging you here, but could someone more familiar with Airflow volunteer looking at the changes in #14631? We just want to make sure that the change is correct.
Looks like a valid one to me
Thanks for taking a look!
I think it'll also be really useful to provide an auto-fix for these rules. Refer to how NumPy 2.0 deprecation rule does it:
ruff/crates/ruff_linter/src/rules/numpy/rules/numpy_2_0_deprecation.rs
Lines 683 to 708 in 1bd8fbb
And, I think we'll also need to add this guard check specific to Airflow imports and attributes which allows for having both the new and deprecated symbols to exists in a try .. except ..
block:
ruff/crates/ruff_linter/src/rules/numpy/rules/numpy_2_0_deprecation.rs
Lines 671 to 673 in 1bd8fbb
I agree. Weβre on a tight schedule and would like to prioritise adding all the rules first, but contribution is always welcomed on these.
Weβre on a tight schedule and would like to prioritise adding all the rules first, but contribution is always welcomed on these.
Yeah, I think that's fine. Do you have a rough timeline on when v3 is going to be released? I could spend some time in adding the infrastructure for an auto-fix. I think that shouldn't take much time as we already have a reference implementation. And, I'll mark this as a "help wanted" task in the issue description if someone else is interested.
It seems ruff also has interface to add fixes which seems straightforward. I tested below where users can opt for unsafe fixes using --fix
and also see --diff
before applying them. This fixes some of the common keyword argument related issues. I can raise a PR. This will help users to do migrations using ruff with in place changes for the dag files once they confirm the changes through diffs in large deployments. Sample output as below
Diff : main...tirkarthi:ruff:air302-fixes
cargo run -p ruff -- check --preview --select=AIR302 --unsafe-fixes --diff example_latest_only.py --no-cache
Compiling ruff v0.8.2 (/home/karthikeyan/stuff/python/ruff/crates/ruff)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 11.86s
Running `target/debug/ruff check --preview --select=AIR302 --unsafe-fixes --diff example_latest_only.py --no-cache`
--- example_latest_only.py
+++ example_latest_only.py
@@ -27,7 +27,7 @@
with DAG(
dag_id="latest_only",
- schedule_interval="daily",
+ schedule="daily",
start_date=datetime.datetime(2021, 1, 1),
catchup=False,
tags=["example2", "example3"],
Would fix 1 error.
@tirkarthi Thank you for working on that. I think it will be useful to consider the kind of deprecations and provide the auto-fixes accordingly. Your implementation looks simple enough, I'm happy to look at the PR. One suggestion for the None
case would be to use the remove_argument
function as there's no replacement for that. We could even rename Replacement::None
to Replacement::Remove
to make the intentions clear.
Thanks @dhruvmanila, created PR #14887 . I agree that the fixes need to be categorized like numpy with Replacement
enum updated with more details but I guess it will be a bigger effort and will see if I can tackled it in a separate PR.