DEPR: raise deprecation warning in numpy ufuncs on DataFrames if not aligned + fallback to <1.2.0 behaviour #39239

jorisvandenbossche · 2021-01-17T20:15:01Z

This is obviously a last-minute change, but if people agree on the deprecation, I think we should try to include it in v1.2.1. I think my patch is relatively safe, since converting the input to numpy arrays (what I am doing now manually as fallback) is what happened before adding DataFrame.__array_ufunc__ as well.

It adds quite some lines of code, but it's mostly some simple checking of the exact case which is a bit verbose.

The specific tests I added were verified to pass on pandas 1.1.5, so they codify the previous behaviour (minus the warnings).
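For readers who want to see the two behaviours side by side, here is a minimal sketch. It deliberately avoids calling the ufunc on two non-aligned DataFrames directly, since that result depends on the pandas version. Instead it spells out each behaviour explicitly: passing a plain array ignores the second index, and calling `align` first matches the labels. The frames are the ones used in the whatsnew example; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])
df2 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[1, 2])

# Ignore the labels of df2: the values are combined by position and the
# result keeps the index/columns of df1.
ignoring_index = np.add(df1, np.asarray(df2))

# Align on labels first: both frames get the union of the indices, and
# rows that exist in only one of them become NaN.
left, right = df1.align(df2)
aligned = np.add(left, right)

print(ignoring_index)
print(aligned)
```

Only index label 1 is present in both frames, so it is the only row of `aligned` that holds numbers.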

cc @TomAugspurger


jreback

this is not timely
pls wait for 1.2.2 if you must

jorisvandenbossche · 2021-01-18T13:50:38Z

@jreback could you at least read what it is about and give your opinion on that? Because it reverts behavior that was changed in 1.2.0, 1.2.1 is exactly the right time to do it.

jreback

i have to say i am not against the deprecation itself

i like the change and these should align - and i suppose can be deprecated

but the timing is terrible and this is a large amount of code

pls just wait till 1.2.2

jreback · 2021-01-18T13:53:12Z

it is not timely to do things at the last minute

we have had several last minute changes in the past which have been a disaster

-1000 on merging this now

jorisvandenbossche · 2021-01-18T14:07:26Z

Thanks for the feedback @simonjayhawkins, pushed an update

simonjayhawkins · 2021-01-18T14:09:29Z

we are pretty much on top of the regressions from 1.2, so if a short delay enables us to clear the list, it may be worth considering.

but agreed should not be rushed

simonjayhawkins · 2021-01-18T14:12:23Z

FWIW I'm leaning towards keeping the breaking change, both because of the consistency-with-Series argument and because it spares users the code changes needed to avoid the warnings.

simonjayhawkins · 2021-01-18T14:29:08Z

pandas/core/arraylike.py

+        # if at least one is not aligned -> warn and fallback to array behaviour
+        if non_aligned:
+            warnings.warn(
+                "Calling a ufunc on non-aligned DataFrames/Series. Currently, the "


because the Series behavior is different, this warning could be misleading?

Hmm, yes. The most explicit is "non-aligned DataFrames or DataFrame/Series combination" or something like that, but wanted to keep it shorter ..
I agree the current can be misleading though (although you will of course never see the warning with only series)
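As a rough illustration of the kind of warning being worded here, the sketch below emits and captures a deprecation-style warning using only the standard library. The helper name `warn_non_aligned` is hypothetical, the message is a shortened paraphrase rather than the text in the patch, and `FutureWarning` is an assumed category.

```python
import warnings


def warn_non_aligned(stacklevel: int = 2) -> None:
    """Hypothetical helper: warn that positional behaviour will change."""
    warnings.warn(
        # Paraphrased message, not the exact wording from the patch.
        "Calling a ufunc on non-aligned DataFrames (or DataFrame/Series "
        "combination). Currently the indices are ignored; in the future "
        "the inputs will be aligned first.",
        FutureWarning,  # assumed category for a user-facing deprecation
        stacklevel=stacklevel,
    )


# Capture the warning so it can be inspected, e.g. in a test.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_non_aligned()

print(caught[0].category.__name__, "-", caught[0].message)
```

Spelling out "DataFrame/Series combination" in the message avoids suggesting that Series-only calls are affected.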

simonjayhawkins · 2021-01-18T14:57:47Z

See #39184

this would close?

jorisvandenbossche · 2021-01-18T15:24:48Z

Yes, this closes that issue, will update the top post.

jorisvandenbossche · 2021-01-18T19:48:28Z

> we are pretty much on top of the regressions from 1.2, so if a short delay enables us to clear the list, it may be worth considering.

If delaying the release by 1 or 2 days helps get this merged, I think that is worth it.

simonjayhawkins · 2021-01-18T19:55:50Z

Maybe add something in the 1.2 release notes after "Calling a binary-input NumPy ufunc on multiple DataFrame objects now aligns, matching the behavior of binary operations and ufuncs on Series (:issue:`23743`)." like: "This change has been reverted and deprecated instead in pandas 1.2.1, see :doc:`v1.2.1`", or link directly to the subsection.

simonjayhawkins · 2021-01-18T20:06:40Z

> If delaying the release by 1 or 2 days helps get this merged, I think that is worth it.

i'll add the blocker tag here until there is consensus to re-open up the 1.2.1 milestone for new issues/PRs e.g. #39253 that need not block, but could potentially be completed before this.

jreback · 2021-01-19T13:59:52Z

doc/source/whatsnew/v1.2.1.rst

+
+.. code-block:: python
+
+    >>> df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])


this is an incorrect format

jreback · 2021-01-19T14:00:04Z

doc/source/whatsnew/v1.2.1.rst

+
+.. code-block:: python
+
+    >>> df1 + df2


make an actual ipython block

I need to use some plain code-blocks, since part of the example shows old behaviour (or behaviour that will change in the future), so I prefer to use code-blocks for all examples, for consistency within this section.

we use ipython blocks everywhere, pls do this

would like to change these to be consistent

doc/source/whatsnew/v1.2.1.rst

jreback · 2021-01-19T14:01:18Z

doc/source/whatsnew/v1.2.1.rst

+
+.. code-block:: python
+
+    >>> np.add(df1, np.asarray(df2))


use an actual ipython format

jreback · 2021-01-19T14:02:17Z

pandas/core/arraylike.py

    from pandas.core.generic import NDFrame
    from pandas.core.internals import BlockManager

    cls = type(self)

+    is_ndframe = [isinstance(x, NDFrame) for x in inputs]


why would you do this? simply check is_series. this is amazingly confusing.

What is is_series ?

we have dataframes and series

Yes, and NDFrame is the parent class for both? Do you want me to put isinstance(x, (Series, DataFrame)) instead of isinstance(x, NDFrame) ?

yes i think its more clear

Note that further down in this __array_ufunc__ function, we are also using NDFrame for this purpose

so rename this to is_series_or_frame i think is more clear

I renamed it now to n_alignable, because alignable is the variable name that is already used below, for consistency. And it also matches the explanation in the comment (which says this is Series or DataFrame).
(but can also rename to n_series_or_frame if you prefer)

pandas/core/arraylike.py

jreback · 2021-01-19T14:05:46Z

pandas/core/arraylike.py

+                "Calling a ufunc on non-aligned DataFrames (or DataFrame/Series "
+                "combination). Currently, the indices are ignored and the result "
+                "takes the index/columns of the first DataFrame. In the future "
+                "(pandas 2.0), the DataFrames/Series will be aligned before "


don't need to mention the version

would not mention here
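The "indices are ignored and the result takes the index/columns of the first DataFrame" behaviour described in the warning can be emulated with a few lines. This is only a sketch of the idea, not the code in the patch: `ufunc_ignoring_index` is a hypothetical helper, and it assumes the first input is the DataFrame whose labels should be kept.

```python
import numpy as np
import pandas as pd


def ufunc_ignoring_index(ufunc, first: pd.DataFrame, *others) -> pd.DataFrame:
    """Hypothetical helper: apply ``ufunc`` positionally, ignoring labels."""
    # Drop all labels by converting every input to a plain ndarray.
    arrays = [np.asarray(first)] + [np.asarray(other) for other in others]
    values = ufunc(*arrays)
    # Re-attach the labels of the first DataFrame to the positional result.
    return pd.DataFrame(values, index=first.index, columns=first.columns)


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])
ser = pd.Series([10, 20], index=["x", "y"])  # labels do not match df.columns

out = ufunc_ignoring_index(np.add, df, ser)
print(out)
```

Because the Series labels are discarded, its two values are broadcast across the columns by position, which is exactly the kind of silent mismatch the warning is meant to flag.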

pandas/core/arraylike.py


simonjayhawkins · 2021-01-19T15:25:54Z

@jorisvandenbossche did you see #39239 (comment)

(also since this PR is the blocker could maybe also update the release date in the notes to maybe save an extra ci/backport cycle)

jorisvandenbossche · 2021-01-19T15:34:55Z

@simonjayhawkins yeah, I saw that, thanks for the reminder, as I still need to do that.
I was maybe thinking to also add a subsection (or mainly the title + short summary) to the deprecations section, linking to the 1.2.1 page. Although the deprecation was not in 1.2.0 itself, people wanting to know what changed in 1.2.x in general will still mostly look at the 1.2.0 page.

> (also since this PR is the blocker could maybe also update the release date in the notes to maybe save an extra ci/backport cycle)

Good idea. What's your current idea about the timeline? (eg try to merge this PR this evening, and start release process tomorrow morning? in which case I pick the date of tomorrow)

jorisvandenbossche · 2021-01-19T15:42:34Z

Actually, we don't have subsections yet in the deprecations section in v1.2.0.rst, so I just clarified the original whatsnew note as you suggested @simonjayhawkins

simonjayhawkins · 2021-01-19T15:45:08Z

I normally like to start the release nearer to the start of the day, but could get going on the final pre-release checks #38721 (comment) as soon as this is backported.

I think the date should match the tag (and we discussed templating this #21050 (comment)) which may not match the github release if the release spans a couple of days.

If we're not sure, maybe best to leave out of this PR and I could expedite the change by not waiting for ci to complete (i normally wait for ci to complete which can add a couple of hours to the release process, for the change to master and again for the backport PR)

jreback · 2021-01-19T16:43:21Z

pandas/core/arraylike.py

+    """
+    Helper to check if a DataFrame is aligned with another DataFrame or Series.
+    """
+    from pandas.core.frame import DataFrame


might as well just import from pandas here, this is only the import if you can import at the top of the file (not sure if you can), also maybe can use ABCDataFrame

pandas/core/frame.py imports from this file, so I don't think I can move the import to the top of the file

i get that you cannot put the import at the top. However when inside the function the style is to
from pandas import DataFrame

OK, changed the imports
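To make the helper's purpose concrete, here is a minimal sketch of an "is this DataFrame aligned with that object" check, written with the function-level `from pandas import DataFrame` style requested above. It uses only public `Index.equals` calls and is not necessarily how the patch implements it. The name `is_aligned` is hypothetical, and treating a Series as aligned when its index equals the frame's columns is an assumption based on how DataFrame/Series arithmetic matches labels by default.

```python
import pandas as pd


def is_aligned(frame, other) -> bool:
    """Hypothetical check: do ``frame`` and ``other`` already share labels?"""
    # Imported inside the function, as one would in a module that
    # pandas/core/frame.py itself imports from (avoids a circular import).
    from pandas import DataFrame

    if isinstance(other, DataFrame):
        # Two frames are aligned when both axes carry identical labels.
        return frame.index.equals(other.index) and frame.columns.equals(
            other.columns
        )
    # Assumed rule for a Series: its index is matched against the columns.
    return frame.columns.equals(other.index)


df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[0, 1])
df2 = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=[1, 2])
ser = pd.Series([1, 2], index=["a", "b"])

print(is_aligned(df1, df2), is_aligned(df1, df1.copy()), is_aligned(df1, ser))
```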

pandas/core/arraylike.py

jreback · 2021-01-19T16:44:27Z

pandas/core/arraylike.py

+    is_ndframe = [isinstance(x, NDFrame) for x in inputs]
+    is_frame = [isinstance(x, DataFrame) for x in inputs]
+
+    if (sum(is_ndframe) >= 2) and (sum(is_frame) >= 1):


this condition is impossible to reason about. pls make it simpler. you just want to know if you have 2 or more dataframes right? (or series)? if so, just say that

No, I want to know if there are at least two alignable objects (DataFrame or Series) and at least one DataFrame, which is what the above line does, and which is what is explained on the line just below. I can try to clarify that comment if something is not clear about that?

try to simplify.

Sorry, Jeff, if you don't give me a clue about what exactly is unclear to you or how you would do it differently, I have no idea how to improve this. The code reflects exactly what I just explained it needs to check, and it is explained in the line below as well.

Would eg change sum(is_frame) into a variable n_frames help? (and moving the sum to the list comprehension where now is_frame is defined)

well, the problem is that this is getting so complicated that you need to comment. I honestly don't think it is worth making this much change at this late hour.

if you want to do this for 1.2.2, or better yet 1.3, ok

waiting for the nth change is extremely painful and disruptive.

these are supposed to be lightweight backports. this is turning in to a nightmare.

this is likely going to be extremely fragile and break again. and will then have to be patched again.

Waiting for 1.2.2 or 1.3 is not going to make this change any simpler, if you don't help me find out what you don't like about it

> waiting for the nth change is extremely painful and disruptive.

What is this about?

> these are supposed to be lightweight backports. this is turning in to a nightmare.

The change in this PR is a rather clean additional check in the __array_ufunc__ function, to use a different code path in certain cases. It touches almost no existing code, so I would say it is a clean patch to backport.
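The condition debated in this thread (at least two alignable inputs, of which at least one is a DataFrame) can be written with named counts, along the lines of the `n_alignable` / `n_frames` suggestion. This is a sketch for illustration only: the function name `may_need_alignment` is hypothetical, and it uses the public `pd.DataFrame` / `pd.Series` classes rather than the internal `NDFrame`.

```python
import numpy as np
import pandas as pd


def may_need_alignment(*inputs) -> bool:
    """Hypothetical check for when an alignment test is worth doing."""
    # Alignable inputs are the labelled ones: DataFrame or Series.
    n_alignable = sum(isinstance(x, (pd.DataFrame, pd.Series)) for x in inputs)
    n_frames = sum(isinstance(x, pd.DataFrame) for x in inputs)
    # Series-only calls are excluded by requiring at least one DataFrame.
    return n_alignable >= 2 and n_frames >= 1


df = pd.DataFrame({"a": [1, 2]})
ser = pd.Series([1, 2])
arr = np.array([[1], [2]])

print(may_need_alignment(df, df))    # two DataFrames
print(may_need_alignment(df, ser))   # DataFrame/Series combination
print(may_need_alignment(ser, ser))  # Series only
print(may_need_alignment(df, arr))   # only one labelled input
```

Counting rather than checking `len(inputs) == 2` keeps the check valid for ufuncs with more than two inputs.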

jreback

ok i suggested a couple of things to make it more clear. if you can fix the docs as suggested ok to merge.

doc/source/whatsnew/v1.2.0.rst

jreback · 2021-01-19T18:18:17Z

pandas/core/arraylike.py

+    """
+    Helper to check if a DataFrame is aligned with another DataFrame or Series.
+    """
+    from pandas.core.frame import DataFrame


i get that you cannot put the import at the top. However when inside the function the style is to
from pandas import DataFrame

jreback · 2021-01-19T18:18:48Z

pandas/core/arraylike.py

    from pandas.core.generic import NDFrame
    from pandas.core.internals import BlockManager

    cls = type(self)

+    is_ndframe = [isinstance(x, NDFrame) for x in inputs]


so rename this to is_series_or_frame i think is more clear

pandas/core/arraylike.py


jreback

ok lgtm on code / tests. 2 doc comments.

doc/source/whatsnew/v1.2.0.rst

jreback · 2021-01-19T21:08:59Z

pandas/core/arraylike.py

+                "Calling a ufunc on non-aligned DataFrames (or DataFrame/Series "
+                "combination). Currently, the indices are ignored and the result "
+                "takes the index/columns of the first DataFrame. In the future "
+                "(pandas 2.0), the DataFrames/Series will be aligned before "


would not mention here

jreback · 2021-01-19T21:09:24Z

doc/source/whatsnew/v1.2.1.rst

+
+.. code-block:: python
+
+    >>> df1 + df2


would like to change these to be consistent

jorisvandenbossche · 2021-01-20T07:22:38Z

AFAIK the remaining comment is a doc comment on the whatsnew notes, and since this is a somewhat subjective style discussion / not critical IMO, I am going to take the liberty to merge this, so @simonjayhawkins can start the release process early in the day once this is backported and builds have passed.
(but, Jeff, I am happy to further discuss it and do a follow-up PR, we can still update those docs in the 1.2.x cycle and those will already be out in a few weeks)

@simonjayhawkins I also updated the date in the release notes here.

jorisvandenbossche · 2021-01-20T07:27:10Z

@meeseeksdev backport to 1.2.x


jreback · 2021-01-20T09:42:33Z

@jorisvandenbossche pls follow up and fix the docs

it's not a style issue rather this is completely inconsistent with the current docs

we NEVER use this style - always the ipython docs style

…aligned + fallback to <1.2.0 behaviour (pandas-dev#39239)

DEPR: raise deprecation warning in numpy ufuncs on DataFrames if not …

f5e9871

…aligned + fallback to <1.2.0 behaviour

jorisvandenbossche marked this pull request as draft January 17, 2021 20:15

jorisvandenbossche mentioned this pull request Jan 17, 2021

BUG: Numpy ufuncs e.g. np.[op](df1, df2) aligns columns in pandas 1.2.0 where it did not before #39184

Closed

3 tasks

jorisvandenbossche added 2 commits January 18, 2021 08:59

simplify / clean-up

e02392a

allow >2 inputs

c6f6898

jorisvandenbossche marked this pull request as ready for review January 18, 2021 08:19

add whatsnew

8700321

simonjayhawkins added this to the 1.2.1 milestone Jan 18, 2021

jorisvandenbossche added the Deprecate Functionality to remove in pandas label Jan 18, 2021

jorisvandenbossche mentioned this pull request Jan 18, 2021

RLS: 1.2.1 #38721

Closed

simonjayhawkins reviewed Jan 18, 2021

doc/source/whatsnew/v1.2.1.rst

simonjayhawkins reviewed Jan 18, 2021

doc/source/whatsnew/v1.2.1.rst

simonjayhawkins reviewed Jan 18, 2021

pandas/core/arraylike.py

simonjayhawkins reviewed Jan 18, 2021

doc/source/whatsnew/v1.2.1.rst

simonjayhawkins reviewed Jan 18, 2021

doc/source/whatsnew/v1.2.1.rst

jreback requested changes Jan 18, 2021

update for feedback

1a6f257

simonjayhawkins reviewed Jan 18, 2021


clarify wording in warning

64b9430

jreback requested changes Jan 19, 2021


jorisvandenbossche added 4 commits January 19, 2021 15:35

refactor into separate helper function

3b66b14

fixup

4dcde0e

add link to original PR

20be3c7

Merge remote-tracking branch 'upstream/master' into array-ufunc-align…

097de71

…ment-deprecation

add note to v1.2.0 as well

f80b780

jreback reviewed Jan 19, 2021


jreback requested changes Jan 19, 2021


jorisvandenbossche added 3 commits January 19, 2021 19:28

Merge remote-tracking branch 'upstream/master' into array-ufunc-align…

dabd47f

…ment-deprecation

clean-up based on review

eaa83ed

add longer note in deprecation section of v1.2.0 docs

81e7c84

jreback requested changes Jan 19, 2021


jorisvandenbossche added 2 commits January 20, 2021 08:18

remove pandas 2.0 mention

4703410

update date release notes

5ed00bb

jorisvandenbossche merged commit ff628b1 into pandas-dev:master Jan 20, 2021

jorisvandenbossche deleted the array-ufunc-alignment-deprecation branch January 20, 2021 07:23

meeseeksmachine mentioned this pull request Jan 20, 2021

Backport PR #39239 on branch 1.2.x (DEPR: raise deprecation warning in numpy ufuncs on DataFrames if not aligned + fallback to <1.2.0 behaviour) #39288

Merged

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jan 20, 2021

Backport PR pandas-dev#39239: DEPR: raise deprecation warning in nump…

7791772

…y ufuncs on DataFrames if not aligned + fallback to <1.2.0 behaviour

jorisvandenbossche added a commit that referenced this pull request Jan 20, 2021

Backport PR #39239: DEPR: raise deprecation warning in numpy ufuncs o…

69f4f96

…n DataFrames if not aligned + fallback to <1.2.0 behaviour (#39288) Co-authored-by: Joris Van den Bossche <[email protected]>

jreback mentioned this pull request Jan 20, 2021

DOC: fix the incorrect doc style in 1.2.1 #39290

Closed

nofarm3 pushed a commit to nofarm3/pandas that referenced this pull request Jan 21, 2021

DEPR: raise deprecation warning in numpy ufuncs on DataFrames if not …

05d82ac

…aligned + fallback to <1.2.0 behaviour (pandas-dev#39239)

rhshadrach mentioned this pull request Dec 18, 2022

DEPR: log of deprecations in 1.x (to be removed in 2.0) #30228

Closed

mroeschke mentioned this pull request Dec 28, 2022

DEPR: Enforce alignment with numpy ufuncs #50455

Merged

