ENH: Remove deepcopies when slicing cubes and copying coords #2261
Happy to squash when you're happy, @marqh.
2a2b212 to f775801
Sorry to come in on this party so late..
I've just reviewed our deprecation "rules", as set out in the change management whitepaper.
Key bits I think are relevant :
- "A deprecation is issued when we decide that an existing feature needs to be removed or modified : We add notices to the documentation, and issue a runtime "Deprecation Warning" whenever the feature is used."
- "sometimes we really need to change the way an API works, without modifying or extending (i.e. complicating) the existing user interface" ...
- ... "For changes of this sort, the release will define a new boolean property of the iris.FUTURE object"
- "the new option defaults to iris.FUTURE.<new_enable>=False, meaning the ‘old’ behaviour is the default" ...
- ... "when any relevant API call is made that invokes the old behaviour, a deprecation warning is emitted."
So, according to those definitions, I think ...
- this is a deprecation of a feature, and a FUTURE flag is the right way to approach it
- it ought to issue a DeprecationWarning when the old behaviour is used.
I also think a couple of the additions to existing docstrings are a bit confusing
- they must not suggest that we are deprecating the actual routines.
- they should explain what the old and new behaviours look like (or refer to somewhere that does)
- they must make it clear that the current default behaviour is unchanged, but users should now be enabling the new one
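The flag-plus-warning pattern quoted above can be sketched roughly as follows. This is a minimal illustration only: the `Future` class, the module-level `FUTURE` object and `copy_values` are hypothetical stand-ins, not the real `iris.FUTURE` implementation.

```python
import warnings


class Future:
    """Hypothetical stand-in for a FUTURE-style flags object (not iris.FUTURE)."""

    def __init__(self, share_data=False):
        # False keeps the old behaviour as the default.
        self.share_data = share_data


FUTURE = Future()


def copy_values(values):
    """Share `values` under the new behaviour; copy (and warn) under the old."""
    if FUTURE.share_data:
        # New behaviour: hand back the same object.
        return values
    # Old behaviour is still the default, so tell the user it is going away.
    warnings.warn(
        "Copying is deprecated; set FUTURE.share_data = True to share data.",
        DeprecationWarning,
        stacklevel=2,
    )
    return list(values)
```

With the flag left at its default the caller still gets an independent copy, but also a `DeprecationWarning` pointing at their own call site (via `stacklevel=2`); setting the flag to `True` silences the warning and opts in to sharing.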
lib/iris/__init__.py
controls whether `Coord.copy()` defaults to creating coordinates
whose `points` and `bounds` attributes are views onto the
original coordinate's attributes.
I think it's worth adding a sentence explaining the alternative. Something like ...
"Conversely, when share_data=False (currently the default), an indexed partial cube or copied coord always contains fully independent, copied data arrays."
lib/iris/coords.py
@@ -516,13 +516,26 @@ def copy(self, points=None, bounds=None):
.. note:: If the points argument is specified and bounds are not, the
resulting coordinate will have no bounds.

.. deprecated:: 1.12
I think this needs to be clearer : It reads like you are proposing to remove the coord.copy function altogether (or some keywords)
-- I assume not ?!?
lib/iris/cube.py
@@ -2153,6 +2153,14 @@ def __getitem__(self, keys):
requested must be applicable directly to the cube.data attribute. All
metadata will be subsequently indexed appropriately.

.. deprecated:: 1.12
Again, what we are actually deprecating is only an aspect of behaviour, so this still doesn't read quite right.
I'd suggest something like :
.. deprecated:: 1.12
In future, the `data` attribute of the indexing result may be a view onto the original data array,
to avoid unnecessary copying. For the present, however, indexing always produces a full
independent copy.
The `share_data` attribute of `iris.FUTURE` should be set to True to enable this new
data-sharing behaviour (if not, a deprecation warning will be issued).
Also, awkwardly, at present the Sphinx docs pass all this one by anyway, as we aren't processing docstrings for special functions.
I think you can fix that by explicitly generating docs for specific routines, and maybe we should do that with this one, but I'm not immediately sure how.
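One possible way to get a special method documented, assuming the docs are built with the standard `sphinx.ext.autodoc` extension (the project's own doc generation may be set up differently), is to name it in the autodoc defaults. The fragment below is a hypothetical `conf.py` excerpt, not the project's actual configuration.

```python
# conf.py fragment (hypothetical; assumes sphinx.ext.autodoc is in use)
extensions = ["sphinx.ext.autodoc"]

# Ask autodoc to document __getitem__ alongside the ordinary members.
autodoc_default_options = {
    "members": True,
    "special-members": "__getitem__",
}
```

Alternatively, a single method can be pulled in explicitly with an `.. automethod::` directive in the relevant `.rst` page, which avoids changing the defaults for every class.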
@@ -0,0 +1,3 @@
* Deprecated the data-copying behaviour of Cube indexing and `Coord.copy()`.
Needs a bit more explanation I think, as everyone is going to be affected.
For the whatsnew, I think that could be just a link to a better explanation.
lib/iris/coords.py
@@ -516,13 +516,26 @@ def copy(self, points=None, bounds=None):
.. note:: If the points argument is specified and bounds are not, the
resulting coordinate will have no bounds.

.. deprecated:: 1.12

By default the new coordinate's `points` and `bounds` will
I think this explains it from slightly the wrong angle : see the Cube.__getitem__ comment...
lib/iris/coords.py
new_coord.attributes = copy.deepcopy(self.attributes)
new_coord.coord_system = copy.deepcopy(self.coord_system)
else:
new_coord = copy.deepcopy(self)
This branch should issue a deprecation warning.
if isinstance(data, biggus.Array) or not data.flags['OWNDATA']:
data = copy.deepcopy(data)
if not iris.FUTURE.share_data:
# We don't want a view of the data, so take a copy of it if it's
This codepath should issue a deprecation warning.
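As a rough sketch of what warning on the copying codepath could look like, using plain NumPy only: `index_data` and its `share_data` argument are hypothetical, with the argument standing in for the global flag, and lazy (biggus) arrays are not handled here.

```python
import warnings

import numpy as np


def index_data(data, keys, share_data=False):
    """Index a NumPy array, copying the result unless sharing is enabled."""
    result = data[keys]
    if not share_data:
        # Old behaviour: warn, then make sure the result is independent.
        warnings.warn(
            "Indexing currently copies the data; enable data sharing "
            "to get views instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Only copy when the result is a view; advanced indexing
        # has already produced a fresh array that owns its data.
        if not result.flags["OWNDATA"]:
            result = result.copy()
    return result
```

The `OWNDATA` check mirrors the condition in the diff above, so a copy is only taken when NumPy has actually returned a view.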
lib/iris/coords.py
@@ -516,13 +516,26 @@ def copy(self, points=None, bounds=None):
.. note:: If the points argument is specified and bounds are not, the
resulting coordinate will have no bounds.

.. deprecated:: 1.12
On a completely separate issue from deprecations etc...
I'm a bit bothered about the scope of the new behaviour in coordinates.
Firstly, I'm not convinced that it is expected for a method called "copy" to return a view at all.
I'm thinking that it would make more sense to have .copy() return a full copy, and use [:] for a toplevel-copy.
Isn't that what numpy does, so that is what we should emulate ?
E.G.
>>> import numpy as np
>>> a = np.arange(10)
>>> b = a[2:4]
>>> b[:] = 0
>>> a
array([0, 1, 0, 0, 4, 5, 6, 7, 8, 9])
>>> b = a.copy()
>>> b[:] = -1
>>> a
array([0, 1, 0, 0, 4, 5, 6, 7, 8, 9])
Secondly, as I've just mentioned it, shouldn't Coord.__getitem__ be returning copies ??
(that is, instead, if you accept the previous...)
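To make the first suggestion above concrete, here is a toy sketch of NumPy-like semantics on a coordinate-like object: `.copy()` gives an independent array, while `[:]` shares the buffer. `ToyCoord` is hypothetical and holds only a points array; it is not the Iris `Coord` class.

```python
import numpy as np


class ToyCoord:
    """Hypothetical, minimal coordinate holding only a points array."""

    def __init__(self, points):
        self.points = np.asarray(points)

    def copy(self):
        # Full, independent copy, as ndarray.copy() gives.
        return ToyCoord(self.points.copy())

    def __getitem__(self, keys):
        # Basic slicing yields a view onto the same buffer.
        return ToyCoord(self.points[keys])


coord = ToyCoord(np.arange(10))
view = coord[:]
independent = coord.copy()

print(np.shares_memory(coord.points, view.points))         # True
print(np.shares_memory(coord.points, independent.points))  # False
```

`np.shares_memory` is a convenient way to check in a test which of the two behaviours an operation actually has.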
Thanks @pp-mo, ah, you're onto something here; I think I agree. The subtlety is that
6410d9e to 2810779
Hi @pp-mo, I think I have found a more reasonable behaviour for this views/copy business for the coords object (see the latest commit). Summary:
I haven't updated the deprecation issues you mentioned yet (but I will do). Let me know if you're happy with the underlying chosen behaviour. Cheers
I'm a bit confused now as to what we are (and are not) doing with coords...
I'm also wondering whether you need to apply these tests to a DimCoord as well ?
Meanwhile...
I think it passes the tests, anyway.
@cpelley uncovered some unexpected behaviour (as discussed offline):
This only applies to an AuxCoord, as the DimCoord value arrays are not writeable. For the present case, we might consider introducing this type of behaviour, i.e. a sliced coord is linked to the original. We might also consider extending that to DimCoords.
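The writeable/non-writeable distinction can be illustrated with plain NumPy arrays. This sketch assumes the read-only behaviour comes from NumPy's `writeable` flag; the arrays below merely stand in for DimCoord-like and AuxCoord-like points.

```python
import numpy as np

points = np.arange(5.0)          # writeable, AuxCoord-like
locked = points.copy()
locked.setflags(write=False)     # read-only, DimCoord-like

# A view of a read-only array is itself read-only.
locked_view = locked[1:3]
try:
    locked_view[:] = 0.0
    write_failed = False
except ValueError:
    write_failed = True

# A view of a writeable array can silently modify the original.
aux_view = points[1:3]
aux_view[:] = -1.0

print(write_failed)  # True
print(points)        # [ 0. -1. -1.  3.  4.]
```

So a sliced read-only array cannot be used to alter its parent, whereas a sliced writeable array can, which is the linkage described above.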
NOTE:
Another NOTE:
Thanks @pp-mo, I'll update the ticket description. There are so many subtleties that I think it's worth going into some detail in the description for the record.
Thanks @pp-mo, yes, looks neater and simpler, thanks.
You will be pleased to know that I have found some further subtleties. Some I have fixed and others I'm working on.
Gosh!
@pp-mo I think I'm ready. The travis failure is unrelated (I have restarted the tests but to no avail).
Hi @cpelley, your patience is appreciated; I hope you are still with us. We have been having trouble with the above test failure while preparing the 1.12 release. There is now a fix on the 1.12.x branch, which hasn't been merged back into master yet. I'd like this functionality in 1.12 and to see the tests passing, so please rebase and retarget (a new PR is fine if that is easier) and we'll do our best to get this in. Thank you.
9abd578 to e1add28
Thanks @marqh, I have changed base and squashed.
e1add28 to bcc0100
Test failures are due to unrelated (files untouched) license header failures (problems with 1.12.x).
bcc0100 to 20b0abe
Rebased but I'm still getting what looks like unrelated problems...
ping
20b0abe to ca0f7dd
Hi @marqh, I have rebased again and have the same error (looks unrelated to me??)
- When indexing a cube whose data is realised (not lazy), a view of the original array is returned where possible (subject to NumPy's slicing rules).
- When indexing a cube whose data is not realised (lazy), realising the data on one cube will still not realise the data on the other.
- The existing behaviour is that slicing coordinates returns views of the original points and bounds (where possible). This was likely chosen on the basis that DimCoords, at least, are not writeable. The same is not true of auxiliary coordinates, however, which arguably makes this a bug (i.e. one can modify AuxCoord points and bounds).
- DimCoord slicing will now return views of its data, like AuxCoords. DimCoords will continue to realise data, unlike AuxCoords, because of the validation needed to check that they are monotonic.
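The "where possible" caveat in the first point follows from NumPy's own indexing rules: basic slicing returns a view, while advanced (list or boolean) indexing always returns a copy. A small plain-NumPy check, independent of Iris:

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

basic = data[1:, ::2]      # basic slicing: a view of `data`
fancy = data[[0, 2], :]    # list index (advanced indexing): a copy
masked = data[data > 5]    # boolean mask (advanced indexing): a copy

print(np.shares_memory(data, basic))   # True
print(np.shares_memory(data, fancy))   # False
print(np.shares_memory(data, masked))  # False
```

Code relying on the sharing behaviour therefore cannot assume every indexing operation yields a linked result.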
ca0f7dd to 00697e8
87cb1c9 to a35908b
Urgh, no end of trouble with travis on this one... TimedOutException on the Python3.4 TARGET=DEFAULT tests :( There are genuine test failures though: the example tests (i.e. TARGET=example) are failing because the warning about this deprecation is issued. Before I look at this, do we really want the example tests failing just because deprecation warnings are issued?
This is an explicit choice, and one that I support: I do not want 'example' code using deprecated features.
Closed in favour of #2549
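For illustration, one common way to make tests fail on deprecated usage is to promote `DeprecationWarning` to an error while the code under test runs. This is a generic sketch using the standard `warnings` module; `deprecated_call` and `run_strict` are hypothetical names, and this is not necessarily how the example tests here are implemented.

```python
import warnings


def deprecated_call():
    """Hypothetical stand-in for example code that uses a deprecated feature."""
    warnings.warn("old behaviour", DeprecationWarning, stacklevel=2)
    return 42


def run_strict(func):
    """Run `func`, turning any DeprecationWarning into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return func()
```

Under `run_strict`, a call such as `deprecated_call()` raises `DeprecationWarning` instead of returning, so any example still using the old behaviour shows up as a test failure rather than as log noise.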
Summary:
Replaces #1992