astral-sh/ruff

Airflow 3.0 deprecations

Opened this issue Β· 15 comments

Tracking issue for the AIR3xx rules that implements Airflow 3.0 deprecations.

For details, refer to the Airflow tracking issue: apache/airflow#44556

Help wanted

The following is a "help wanted" task list:

  • Add infrastructure to provide an auto-fix for the migration rules. Look at NumPy 2.0 implementation for reference
  • Add a guard specifically for Airflow imports and attributes where a try .. except block could include both the new and deprecated symbols for backwards compatibility. Look at NumPy 2.0 implementation for reference

Ref: apache/airflow#41641

Lee-W commented

There's already a short proposal #14616 (comment). Putting it here might make it easier to find πŸ™‚


I'm thinking of separating "removal" and "rename" at least. It would be much easier for the users to know whether there's anything they could do to fix it.

Lee-W commented

Hi, we have grouped our deprecations into several categories and would appreciate it if the team could take a look and let us know what you think about it πŸ™‚

  • already merged
    • AIR301: Avoid implicit DAG schedule #14581
  • newly proposed error number
    • AIR302: removal or no direct mapping
    • AIR303: rename
      • It would be much easier for the users to know whether they could do anything to fix it.
    • AIR304: moved to airflow providers
      • for these, they're similar to "rename" but require the user to install an additional package (e.g., apache-airflow-providers-fab)

Here's the categorised list we have now
apache/airflow#41641 (comment)
(there's another group we temporarily mark as AIR305, but we'll need some time to figure out what we can do for those. Will likely to be merged into 302 or 303)

I like what's proposed here. Thanks for going through all the deprecations and grouping them by possible rules. We can also do some fine-tuning as we go.

Lee-W commented

Later discuss with @uranusjr, we think it would be better for us to merge rename and removal into 302. So the current idea would be

  • already merged
    • AIR301: Avoid implicit DAG schedule (#14581)
    • AIR302: removal (#14582)
  • new rule number
    • AIR303: moved to airflow providers
      • for these, they're similar to "rename" but require the user to install an additional package (e.g., apache-airflow-providers-fab)

We now use apache/airflow#44556 to track the detail of this task instead of the original one as some of the migration might not be able to be done in ruff

Sorry for pinging you here, but could someone more familiar with Airflow volunteer looking at the changes in #14631? We just want to make sure that the change is correct.

Lee-W commented

Sorry for pinging you here, but could someone more familiar with Airflow volunteer looking at the changes in #14631? We just want to make sure that the change is correct.

Looks like a valid one to me

Thanks for taking a look!

I think it'll also be really useful to provide an auto-fix for these rules. Refer to how NumPy 2.0 deprecation rule does it:

match replacement.details {
Details::AutoImport {
path,
name,
compatibility,
} => {
diagnostic.try_set_fix(|| {
let (import_edit, binding) = checker.importer().get_or_import_symbol(
&ImportRequest::import_from(path, name),
expr.start(),
checker.semantic(),
)?;
let replacement_edit = Edit::range_replacement(binding, expr.range());
Ok(match compatibility {
Compatibility::BackwardsCompatible => {
Fix::safe_edits(import_edit, [replacement_edit])
}
Compatibility::Breaking => Fix::unsafe_edits(import_edit, [replacement_edit]),
})
});
}
Details::AutoPurePython { python_expr } => diagnostic.set_fix(Fix::safe_edit(
Edit::range_replacement(python_expr.to_string(), expr.range()),
)),
Details::Manual { guideline: _ } => {}
};

And, I think we'll also need to add this guard check specific to Airflow imports and attributes which allows for having both the new and deprecated symbols to exists in a try .. except .. block:

if is_guarded_by_try_except(expr, &replacement, semantic) {
return;
}

I agree. We’re on a tight schedule and would like to prioritise adding all the rules first, but contribution is always welcomed on these.

We’re on a tight schedule and would like to prioritise adding all the rules first, but contribution is always welcomed on these.

Yeah, I think that's fine. Do you have a rough timeline on when v3 is going to be released? I could spend some time in adding the infrastructure for an auto-fix. I think that shouldn't take much time as we already have a reference implementation. And, I'll mark this as a "help wanted" task in the issue description if someone else is interested.

It seems ruff also has interface to add fixes which seems straightforward. I tested below where users can opt for unsafe fixes using --fix and also see --diff before applying them. This fixes some of the common keyword argument related issues. I can raise a PR. This will help users to do migrations using ruff with in place changes for the dag files once they confirm the changes through diffs in large deployments. Sample output as below

Diff : main...tirkarthi:ruff:air302-fixes

cargo run -p ruff -- check --preview --select=AIR302 --unsafe-fixes --diff example_latest_only.py --no-cache
   Compiling ruff v0.8.2 (/home/karthikeyan/stuff/python/ruff/crates/ruff)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 11.86s
     Running `target/debug/ruff check --preview --select=AIR302 --unsafe-fixes --diff example_latest_only.py --no-cache`
--- example_latest_only.py
+++ example_latest_only.py
@@ -27,7 +27,7 @@
 
 with DAG(
     dag_id="latest_only",
-    schedule_interval="daily",
+    schedule="daily",
     start_date=datetime.datetime(2021, 1, 1),
     catchup=False,
     tags=["example2", "example3"],

Would fix 1 error.

@tirkarthi Thank you for working on that. I think it will be useful to consider the kind of deprecations and provide the auto-fixes accordingly. Your implementation looks simple enough, I'm happy to look at the PR. One suggestion for the None case would be to use the remove_argument function as there's no replacement for that. We could even rename Replacement::None to Replacement::Remove to make the intentions clear.

Thanks @dhruvmanila, created PR #14887 . I agree that the fixes need to be categorized like numpy with Replacement enum updated with more details but I guess it will be a bigger effort and will see if I can tackled it in a separate PR.

Lee-W commented

Just created a new one for AIR303 #14764