Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Airflow 3.0 deprecations #14626

Open
2 tasks
dhruvmanila opened this issue Nov 27, 2024 · 15 comments
Open
2 tasks

Airflow 3.0 deprecations #14626

dhruvmanila opened this issue Nov 27, 2024 · 15 comments
Labels
help wanted Contributions especially welcome rule Implementing or modifying a lint rule

Comments

@dhruvmanila
Copy link
Member

dhruvmanila commented Nov 27, 2024

Tracking issue for the AIR3xx rules that implements Airflow 3.0 deprecations.

For details, refer to the Airflow tracking issue: apache/airflow#44556

Help wanted

The following is a "help wanted" task list:

  • Add infrastructure to provide an auto-fix for the migration rules. Look at NumPy 2.0 implementation for reference
  • Add a guard specifically for Airflow imports and attributes where a try .. except block could include both the new and deprecated symbols for backwards compatibility. Look at NumPy 2.0 implementation for reference

Ref: apache/airflow#41641

@Lee-W
Copy link
Contributor

Lee-W commented Nov 28, 2024

There's already a short proposal #14616 (comment). Putting it here might make it easier to find 🙂


I'm thinking of separating "removal" and "rename" at least. It would be much easier for the users to know whether there's anything they could do to fix it.

@dhruvmanila
Copy link
Member Author

Additionally, the NumPy 2.0 migration rule is a good reference whose implementation is at: https://github.com/astral-sh/ruff/blob/6f1cf5b686e67fb74383641fcbbe98e33481a271/crates/ruff_linter/src/rules/numpy/rules/numpy_2_0_deprecation.rs

@Lee-W
Copy link
Contributor

Lee-W commented Nov 29, 2024

Hi, we have grouped our deprecations into several categories and would appreciate it if the team could take a look and let us know what you think about it 🙂

Here's the categorised list we have now
apache/airflow#41641 (comment)
(there's another group we temporarily mark as AIR305, but we'll need some time to figure out what we can do for those. Will likely to be merged into 302 or 303)

@MichaReiser
Copy link
Member

I like what's proposed here. Thanks for going through all the deprecations and grouping them by possible rules. We can also do some fine-tuning as we go.

@Lee-W
Copy link
Contributor

Lee-W commented Dec 3, 2024

Later discuss with @uranusjr, we think it would be better for us to merge rename and removal into 302. So the current idea would be

We now use apache/airflow#44556 to track the detail of this task instead of the original one as some of the migration might not be able to be done in ruff

@MichaReiser
Copy link
Member

Sorry for pinging you here, but could someone more familiar with Airflow volunteer looking at the changes in #14631? We just want to make sure that the change is correct.

@Lee-W
Copy link
Contributor

Lee-W commented Dec 4, 2024

Sorry for pinging you here, but could someone more familiar with Airflow volunteer looking at the changes in #14631? We just want to make sure that the change is correct.

Looks like a valid one to me

@MichaReiser
Copy link
Member

Thanks for taking a look!

@dhruvmanila
Copy link
Member Author

I think it'll also be really useful to provide an auto-fix for these rules. Refer to how NumPy 2.0 deprecation rule does it:

match replacement.details {
Details::AutoImport {
path,
name,
compatibility,
} => {
diagnostic.try_set_fix(|| {
let (import_edit, binding) = checker.importer().get_or_import_symbol(
&ImportRequest::import_from(path, name),
expr.start(),
checker.semantic(),
)?;
let replacement_edit = Edit::range_replacement(binding, expr.range());
Ok(match compatibility {
Compatibility::BackwardsCompatible => {
Fix::safe_edits(import_edit, [replacement_edit])
}
Compatibility::Breaking => Fix::unsafe_edits(import_edit, [replacement_edit]),
})
});
}
Details::AutoPurePython { python_expr } => diagnostic.set_fix(Fix::safe_edit(
Edit::range_replacement(python_expr.to_string(), expr.range()),
)),
Details::Manual { guideline: _ } => {}
};

And, I think we'll also need to add this guard check specific to Airflow imports and attributes which allows for having both the new and deprecated symbols to exists in a try .. except .. block:

if is_guarded_by_try_except(expr, &replacement, semantic) {
return;
}

@uranusjr
Copy link
Contributor

uranusjr commented Dec 9, 2024

I agree. We’re on a tight schedule and would like to prioritise adding all the rules first, but contribution is always welcomed on these.

@dhruvmanila
Copy link
Member Author

We’re on a tight schedule and would like to prioritise adding all the rules first, but contribution is always welcomed on these.

Yeah, I think that's fine. Do you have a rough timeline on when v3 is going to be released? I could spend some time in adding the infrastructure for an auto-fix. I think that shouldn't take much time as we already have a reference implementation. And, I'll mark this as a "help wanted" task in the issue description if someone else is interested.

@dhruvmanila dhruvmanila added the help wanted Contributions especially welcome label Dec 9, 2024
@tirkarthi
Copy link
Contributor

It seems ruff also has interface to add fixes which seems straightforward. I tested below where users can opt for unsafe fixes using --fix and also see --diff before applying them. This fixes some of the common keyword argument related issues. I can raise a PR. This will help users to do migrations using ruff with in place changes for the dag files once they confirm the changes through diffs in large deployments. Sample output as below

Diff : main...tirkarthi:ruff:air302-fixes

cargo run -p ruff -- check --preview --select=AIR302 --unsafe-fixes --diff example_latest_only.py --no-cache
   Compiling ruff v0.8.2 (/home/karthikeyan/stuff/python/ruff/crates/ruff)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 11.86s
     Running `target/debug/ruff check --preview --select=AIR302 --unsafe-fixes --diff example_latest_only.py --no-cache`
--- example_latest_only.py
+++ example_latest_only.py
@@ -27,7 +27,7 @@
 
 with DAG(
     dag_id="latest_only",
-    schedule_interval="daily",
+    schedule="daily",
     start_date=datetime.datetime(2021, 1, 1),
     catchup=False,
     tags=["example2", "example3"],

Would fix 1 error.

@dhruvmanila
Copy link
Member Author

@tirkarthi Thank you for working on that. I think it will be useful to consider the kind of deprecations and provide the auto-fixes accordingly. Your implementation looks simple enough, I'm happy to look at the PR. One suggestion for the None case would be to use the remove_argument function as there's no replacement for that. We could even rename Replacement::None to Replacement::Remove to make the intentions clear.

@tirkarthi
Copy link
Contributor

Thanks @dhruvmanila, created PR #14887 . I agree that the fixes need to be categorized like numpy with Replacement enum updated with more details but I guess it will be a bigger effort and will see if I can tackled it in a separate PR.

@Lee-W
Copy link
Contributor

Lee-W commented Dec 10, 2024

Just created a new one for AIR303 #14764

dhruvmanila added a commit that referenced this issue Dec 10, 2024
…#14887)

## Summary

Add replacement fixes to deprecated arguments of a DAG.

Ref #14582 #14626

## Test Plan

Diff was verified and snapshots were updated.

---------

Co-authored-by: Dhruv Manilawala <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Contributions especially welcome rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

5 participants