How to identify numpy types in python?
F

6

147

How can one reliably determine if an object has a numpy type?

I realize that this question goes against the philosophy of duck typing, but the idea is to make sure a function (which uses scipy and numpy) never returns a numpy type unless it is called with a numpy type. This comes up in the solution to another question, but I think the general problem of determining if an object has a numpy type is far enough away from that original question that they should be separated.

Feinberg answered 24/9, 2012 at 16:49 Comment(2)
One question: If you (or, say, scipy) define a type that subclasses a numpy type, should that count or not? (I believe you can't subclass numpy types in Python, but you can in a C module, and I think you can also subclass numpypy types in PyPy… so it probably doesn't matter, but it's not inconceivable that it could.)Muley
I hadn't thought of that; basically your comment points out that the question is more difficult than expected. Honestly that kind of high-level consideration is way overkill for my situation. For the general and portable answer, I would say that as long as the behaviour is defined then it's OK.Feinberg
M
154

Use the builtin type function to get the type, then use its __module__ attribute to find out where it was defined:

>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> type(a)
<type 'numpy.ndarray'>
>>> type(a).__module__
'numpy'
>>> type(a).__module__ == np.__name__
True
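
A minimal sketch of wrapping this check in a function so that it also covers numpy subpackages such as numpy.ma. The name is_numpy_type is hypothetical, and the rule "anything defined under the numpy package counts" is an assumption about what you want to treat as a numpy type:

```python
import numpy as np


def is_numpy_type(obj):
    """Return True if obj's type is defined in numpy or one of its subpackages.

    Hypothetical helper, not part of numpy itself.
    """
    module = type(obj).__module__
    # Match "numpy" itself or a dotted submodule of it, but not an
    # unrelated package whose name merely starts with "numpy".
    return module == np.__name__ or module.startswith(np.__name__ + ".")


print(is_numpy_type(np.array([1, 2, 3])))           # ndarray
print(is_numpy_type(np.float64(1.0)))               # numpy scalar
print(is_numpy_type(np.ma.masked_array([1, 2])))    # defined in a numpy subpackage
print(is_numpy_type([1, 2, 3]))                     # plain Python list
```

The first three calls should report True and the last one False. Types from other libraries (for example pandas) would need their own clause if they should count as well.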
Muley answered 24/9, 2012 at 17:36 Comment(4)
is e.g. numpy.ma.MaskedArray not a numpy enough type?Prevaricator
If you want anything in numpy.* you just walk the parent package of the module. (At that point, you obviously want to wrap it in a function.) And if you want pandas DataFrames to count as numpyish, add an or to test for that. And so on. The point is, you have to know what you're actually asking for when you want to do something as unusual as loose manual type switching, but once you know, it's easy to implement.Muley
This solution seems very unpythonic, relying on hidden attributes. But maybe that is just a matter of taste?Orphaorphan
@Orphaorphan They’re not hidden attributes, they’re documented special attributes. It is, nevertheless, unpythonic, but I think that’s inherent in the problem. (And I think it’s a strength of Python that when you want to do something the language discourages, the best solution is usually visibly ugly enough to call out that you’re doing something that’s normally a bad idea.)Muley
F
124

The solution I've come up with is:

isinstance(y, (np.ndarray, np.generic) )

However, it's not 100% clear that all numpy types are guaranteed to be either np.ndarray or np.generic, and this probably isn't version robust.
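
A sketch of how this check might serve the goal stated in the question (not returning numpy types to callers who did not pass any). The helpers is_numpy and to_native are hypothetical names, and the conversion via .item() and .tolist() is an assumption that fits common numeric dtypes:

```python
import numpy as np


def is_numpy(y):
    """The isinstance check from above, wrapped in a function."""
    return isinstance(y, (np.ndarray, np.generic))


def to_native(y):
    """Hypothetical helper: convert numpy scalars and arrays to built-in objects."""
    if isinstance(y, np.generic):
        return y.item()      # numpy scalar -> Python scalar
    if isinstance(y, np.ndarray):
        return y.tolist()    # ndarray -> (nested) list of Python scalars
    return y                 # anything else is returned unchanged


print(is_numpy(np.float64(2.5)), is_numpy(2.5))
print(type(to_native(np.float64(2.5))))
print(to_native(np.array([1, 2, 3])))
```

A function could call to_native on its result whenever none of its arguments satisfied is_numpy.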

Feinberg answered 24/9, 2012 at 16:49 Comment(3)
I suppose you could filter dir(numpy) on types and builtin functions (and classes, but I don't think it has any) and use that to generate a tuple to isinstance against, which would be robust. (I believe you can pass builtin functions to isinstance whether they're actually type constructors or not, but you'd have to check that.)Muley
Yes, they should all be subclasses of those two AFAIK.Drier
@Drier Thanks. It certainly seems to be the case for now, but the python documentation isn't very clear on this and it could conceivably change in the future.Feinberg
C
29

Old question, but I came up with a definitive answer with an example. It can't hurt to keep questions fresh, as I had this same problem and didn't find a clear answer. The key is to make sure you have numpy imported, and then run the isinstance check. While this may seem simple, if you are doing computations across different data types, this small check can serve as a quick test before you start a numpy vectorized operation.

##################
# important part!
##################

import numpy as np

####################
# toy array for demo
####################

arr = np.asarray(range(1,100,2))

########################
# The instance check
######################## 

isinstance(arr,np.ndarray)
Cytoplasm answered 16/10, 2016 at 2:30 Comment(1)
Why do you say this is definitive? This is the same answer as https://mcmap.net/q/157971/-how-to-identify-numpy-types-in-python, which was written 4 years before this answer, where the author points to documentation indicating that it isn't definitive.Perished
K
12

That actually depends on what you're looking for.

  • If you want to test whether a sequence is actually an ndarray, isinstance(..., np.ndarray) is probably the easiest. Make sure you don't reload numpy in the background, as the module may be different, but otherwise you should be OK. MaskedArray, matrix, and recarray are all subclasses of ndarray, so you should be set.
  • If you want to test whether a scalar is a numpy scalar, things get a bit more complicated. You could check whether it has a shape and a dtype attribute. You can compare its dtype to the basic dtypes, whose list you can find in np.core.numerictypes.genericTypeRank. Note that the elements of this list are strings, so you'd have to do a tested.dtype is np.dtype(an_element_of_the_list)...
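
The two cases above can be sketched roughly as follows. The function names are hypothetical, and the attribute-based scalar test is only a heuristic: objects from other libraries may also expose shape and dtype.

```python
import numpy as np


def is_ndarray(x):
    """True for ndarray and its subclasses (e.g. masked arrays)."""
    return isinstance(x, np.ndarray)


def looks_like_numpy_scalar(x):
    """Heuristic: has shape and dtype attributes but is not an array."""
    return (
        hasattr(x, "shape")
        and hasattr(x, "dtype")
        and not isinstance(x, np.ndarray)
    )


print(is_ndarray(np.ma.masked_array([1, 2])))     # subclass of ndarray
print(looks_like_numpy_scalar(np.float64(1.0)))   # numpy scalar
print(looks_like_numpy_scalar(1.0))               # Python float has no dtype
# Comparing against a specific dtype built from its string name:
print(np.float64(1.0).dtype == np.dtype("float64"))
```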
Kress answered 24/9, 2012 at 18:20 Comment(1)
+1. If you're actually looking for something besides "is a numpy type", and can define what that something is, this is better than the other answers. And in most cases, you should be looking for something specific that you can define.Muley
A
8

To get the type, use the builtin type function. With the in operator, you can test whether the type is a numpy type by checking whether its string representation contains the string numpy:

In [1]: import numpy as np

In [2]: a = np.array([1, 2, 3])

In [3]: type(a)
Out[3]: <type 'numpy.ndarray'>

In [4]: 'numpy' in str(type(a))
Out[4]: True

(This example was run in IPython, by the way. Very handy for interactive use and quick tests.)
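
A small sketch of one pitfall with string matching: any type whose name or module merely contains "numpy" passes the test. The class numpygroup below is a made-up example, and the module comparison is shown only for contrast:

```python
import numpy as np


class numpygroup:
    """Hypothetical user-defined class that has nothing to do with numpy."""


obj = numpygroup()

# The substring test matches because the class name contains "numpy".
string_check = "numpy" in str(type(obj))

# Comparing the defining module does not have this problem.
module_check = type(obj).__module__ == np.__name__

print(string_check, module_check)
```

Here string_check comes out True (a false positive) while module_check comes out False.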

Autonomous answered 24/9, 2012 at 17:4 Comment(5)
This works, but if you define a type called, say, "numpygroup", you'll get false positives. Also, depending on the string representation of types is a bad idea if you can avoid it—and in this case, you can. Look at its module instead.Muley
Using the module is indeed a better solution.Autonomous
Regex could be usedHonorarium
@Omkaar.K Regex could be used for what? To do the exact same check in a slightly more complicated way?Muley
@abamert "Could" is what I said, also regex could look complicated for simple tasks like these, but it is extremely useful for large string processing tasks, So it is not a bad idea to learn it. I guess you know that already since your portfolio portrays you as a senior programmer?Honorarium
P
7

Note that type(numpy.ndarray) is itself type, and watch out for boolean and scalar types. Don't be too discouraged if it's not intuitive or easy; it's a pain at first.

See also:

  • https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.dtypes.html
  • https://github.com/machinalis/mypy-data/tree/master/numpy-mypy

>>> import numpy as np
>>> np.ndarray
<class 'numpy.ndarray'>
>>> type(np.ndarray)
<class 'type'>
>>> a = np.linspace(1,25)
>>> type(a)
<class 'numpy.ndarray'>
>>> type(a) == type(np.ndarray)
False
>>> type(a) == np.ndarray
True
>>> isinstance(a, np.ndarray)
True

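Fun with booleans (the np.bool and np.bool8 aliases used here were deprecated in later NumPy releases, so the results can differ by version):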

>>> b = a.astype('int32') == 11
>>> b[0]
False
>>> isinstance(b[0], bool)
False
>>> isinstance(b[0], np.bool)
False
>>> isinstance(b[0], np.bool_)
True
>>> isinstance(b[0], np.bool8)
True
>>> b[0].dtype == np.bool
True
>>> b[0].dtype == bool  # python equivalent
True

More fun with scalar types; see https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.scalars.html#arrays-scalars-built-in

>>> x = np.array([1,], dtype=np.uint64)
>>> x[0].dtype
dtype('uint64')
>>> isinstance(x[0], np.uint64)
True
>>> isinstance(x[0], np.integer)
True  # generic integer
>>> isinstance(x[0], int)
False  # but not a python int in this case

# Try matching the `kind` strings, e.g.
>>> np.dtype('bool').kind
'b'
>>> np.dtype('int64').kind
'i'
>>> np.dtype('float').kind
'f'
>>> np.dtype('half').kind
'f'

# But be wary of matching dtypes
>>> np.integer
<class 'numpy.integer'>
>>> np.dtype(np.integer)
dtype('int64')
>>> x[0].dtype == np.dtype(np.integer)
False

# Down these paths there be dragons:

# the .dtype attribute returns a kind of dtype, not a specific dtype
>>> isinstance(x[0].dtype, np.dtype)
True
>>> isinstance(x[0].dtype, np.uint64)
False  
>>> isinstance(x[0].dtype, np.dtype(np.uint64))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: isinstance() arg 2 must be a type or tuple of types
# yea, don't go there
>>> isinstance(x[0].dtype, np.int_)
False  # again, confusing the .dtype with a specific dtype


# Inequalities can be tricky; although they might
# work sometimes, try to avoid these idioms.
# (np.float and np.int were aliases of the builtin types,
# deprecated since NumPy 1.20.)

>>> x[0].dtype <= np.dtype(np.uint64)
True
>>> x[0].dtype <= np.dtype(np.float)
True
>>> x[0].dtype <= np.dtype(np.half)
False  # just when things were going well
>>> x[0].dtype <= np.dtype(np.float16)
False  # oh boy
>>> x[0].dtype == np.int
False  # ya, no luck here either
>>> x[0].dtype == np.int_
False  # or here
>>> x[0].dtype == np.uint64
True  # have to end on a good note!
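
As a possible alternative to the dtype inequalities above, np.issubdtype tests where a dtype sits in numpy's scalar type hierarchy. This is a sketch rather than part of the original session, and the variable names are illustrative:

```python
import numpy as np

x = np.array([1], dtype=np.uint64)

# Ask where the dtype sits in the abstract scalar hierarchy.
is_integer = np.issubdtype(x.dtype, np.integer)
is_unsigned = np.issubdtype(x.dtype, np.unsignedinteger)
is_floating = np.issubdtype(x.dtype, np.floating)

# The one-character kind code is another coarse classifier.
kind = x.dtype.kind

print(is_integer, is_unsigned, is_floating, kind)
```

For a uint64 array the first two flags come out true, the third false, and kind is 'u' (unsigned integer).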
Potboiler answered 1/2, 2019 at 2:16 Comment(0)
