How can I tell if NumPy creates a view or a copy?

For a minimal working example, let's digitize a 2D array. numpy.digitize requires a 1D array:

import numpy as np
N = 200
A = np.random.random((N, N))
X = np.linspace(0, 1, 20)
print(np.digitize(A.ravel(), X).reshape((N, N)))

Now the documentation says:

... A copy is made only if needed.

How do I know whether the copy made by ravel is "needed" in this case? In general, is there a way to determine whether a particular operation creates a copy or a view?

Malignant answered 17/7, 2012 at 14:26 Comment(1)
If you want to force a copy, the best thing I found is to use np.copy, or np.array, e.g. tr = np.array(a.T, copy=True). Rancourt

This question is very similar to one that I asked a while back.

You can check the base attribute.

a = np.arange(50)
b = a.reshape((5, 10))
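print(b.base is a)

However, that check is not perfect: if a is itself a view, b.base refers to the array that owns the data rather than to a. You can also check whether the two arrays might share memory using np.may_share_memory.

print(np.may_share_memory(a, b))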
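
To make the caveats around these two checks concrete, here is a minimal sketch. The variable names are illustrative only, and it assumes np.shares_memory is available (NumPy 1.11 or later). It shows one case where the base test gives a misleading answer, and one where np.may_share_memory is overly cautious because it only compares memory bounds:

```python
import numpy as np

# Case 1: `a` is itself a view, so `b.base` points at the array that
# owns the data (`owner`) rather than at `a`.
owner = np.arange(100)
a = owner[:50]
b = a.reshape((5, 10))
base_is_a = b.base is a                # misleading for a view of a view
overlap_ab = np.shares_memory(a, b)    # exact overlap check

# Case 2: interleaved slices have overlapping memory bounds
# but do not share a single element.
x = np.arange(10)
evens, odds = x[::2], x[1::2]
maybe = np.may_share_memory(evens, odds)   # bounds-only check
exact = np.shares_memory(evens, odds)      # exact check

print(base_is_a, overlap_ab, maybe, exact)
```

The exact check can be slower on awkward stride patterns, so np.may_share_memory remains a reasonable first pass when a conservative answer is acceptable.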

There's also the flags attribute that you can check:

print(b.flags['OWNDATA'])  # False -- apparently this is a view
e = np.ravel(b[:, 2])
print(e.flags['OWNDATA'])  # True -- apparently this is a new NumPy object

But this last one seems a little fishy to me, although I can't quite put my finger on why...
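
One possible reason the OWNDATA flag feels fishy: it only says whether an array object owns its buffer, not whether that buffer was freshly copied. The sketch below (illustrative names, not from the original answer) reshapes a transposed array, which cannot be expressed as a view, and then compares the flag against an explicit memory-overlap test. The flag's value is printed rather than asserted, since it depends on how NumPy builds the result internally:

```python
import numpy as np

a = np.arange(6).reshape((2, 3))
t = a.T                  # non-contiguous view of a
r = t.reshape(-1)        # flattening t in C order requires copying the data

# The flag reports buffer ownership of `r` itself; it can be False here
# if `r` is a view onto an intermediate copy.
print(r.flags['OWNDATA'])

# The overlap test answers the actual question: does r alias a's data?
copied = not np.shares_memory(r, a)
print(copied)
```

In other words, OWNDATA being False does not prove that the result aliases the original array, so an explicit overlap test is the safer check.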

Adenocarcinoma answered 17/7, 2012 at 14:30 Comment(5)
Interesting, thanks for the links to your answers. I'll leave my question up as the wording "view" versus "share" is different (and didn't come up in a search). Malignant
@Malignant -- Yeah, this is just barely different enough for me to answer your question instead of marking it as a duplicate (I don't know what others will think). Anyway, hopefully this is helpful. Adenocarcinoma
I was just trying some of these out and using flags['OWNDATA'] can definitely fail in some cases. In your example, if you use e = np.reshape(b[:, 2], -1) instead of ravel, flags['OWNDATA'] will be False, even though a copy was made. Clishmaclaver
@amicitas, that's because e is in fact a view on e.base which itself is the actual copy of the array produced by the reshape operation. See further here. Tinderbox
@Adenocarcinoma In which way is the first solution not perfect? Isopropyl

The documentation for reshape explains how to make sure an exception is raised if a view cannot be made:

It is not always possible to change the shape of an array without copying the data. If you want an error to be raised if the data is copied, you should assign the new shape to the shape attribute of the array:

>>> a = np.zeros((10, 2))
# A transpose makes the array non-contiguous
>>> b = a.T
# Taking a view makes it possible to modify the shape without modifying the
# initial object.
>>> c = b.view()
>>> c.shape = (20)
AttributeError: incompatible shape for a non-contiguous array



This is not exactly an answer to your question, but in certain cases it may be just as useful.
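
The shape-assignment trick from the quoted documentation can be wrapped in a small helper. This is only a sketch: the function name can_reshape_without_copy is hypothetical, and it assumes NumPy signals an incompatible in-place reshape with AttributeError, as in the traceback above.

```python
import numpy as np

def can_reshape_without_copy(arr, new_shape):
    """Return True if arr can take new_shape without copying its data."""
    probe = arr.view()        # fresh view, so arr's own shape is untouched
    try:
        probe.shape = new_shape   # in-place reshape; never copies
    except AttributeError:
        return False
    return True

a = np.zeros((10, 2))
print(can_reshape_without_copy(a, (20,)))      # contiguous array
print(can_reshape_without_copy(a.T, (20,)))    # transposed, non-contiguous

row = np.zeros((10, 4))[0]                     # contiguous view of one row
print(can_reshape_without_copy(row, (2, 2)))
```

Note that this reports whether the memory layout allows the new shape, not whether the input is itself a view: a contiguous view such as a single row reshapes without a copy just as well as an array that owns its data.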

Clishmaclaver answered 11/1, 2013 at 3:40 Comment(2)
I'm still confused as to why this works. Why should I be able to apply a shape to a non-view? Wynny
This is really telling you if the data is contiguous, not if it is a view. While it's true that non-contiguous data can only be a view, the converse is not true. For example, a = np.zeros((10, 4)); b = a[0]; b.shape = (2, 2) works just fine, because b is a (C) contiguous view of a (which you can already see from b.flags['C_CONTIGUOUS']). Also it should be noted that c = b.view() does nothing in this example. a.T is already a view of a. Combust
