Type hinting / annotation (PEP 484) for numpy.ndarray
G

5

205

Has anyone implemented type hinting for the specific numpy.ndarray class?

Right now, I'm using typing.Any, but it would be nice to have something more specific.

For instance, the NumPy people could add a type alias for their array_like object class. Better yet, they could implement support at the dtype level, so that other objects would be supported, as well as ufuncs.

Gotama answered 27/2, 2016 at 18:44 Comment(8)
I don't recall seeing any use of Python3 type annotation in SO numpy questions or answers.Sanbenito
pypi.python.org/pypi/plac can make use of Py3 annotations to populate an argparse parser. For Py2, it uses decorators to create a similar annotation database.Sanbenito
typing is new to Py 3.5. Many numpy users still work with Py2. I have 3.5 on my system, but I don't have numpy installed for it. numpy developers are not going to add features for the cutting edge of Python (with the exception of the @ operator)Sanbenito
@hpaulj, can you cite your source for the last comment? I'm not sure where I should go to interact with the Numpy maintainers... it could very well be that integrating other 'advanced' Python features would be popular.Gotama
numpy is maintained in a GitHub repository. Look at the issues and pull requests; sign up and submit your own issue. There may be another forum for discussing development issues, but I mostly look at the GitHub issues.Sanbenito
For anyone looking into the issue - it looks like there's a relevant solution here: #52839927Protoplast
There is now an open issue in the numpy github repository regarding type hinting / annotation for numpy types.Hurwitz
> There is now... @Hurwitz this ticket was opened by me, the OP, 4.5 years ago.Gotama
S
72

Update

Check recent numpy versions for a new typing module

https://numpy.org/doc/stable/reference/typing.html#module-numpy.typing

dated answer

It looks like the typing module was developed at:

https://github.com/python/typing

The main numpy repository is at

https://github.com/numpy/numpy

Python bugs and commits can be tracked at

http://bugs.python.org/

The usual way of adding a feature is to fork the main repository, develop the feature till it is bomb proof, and then submit a pull request. Obviously at various points in the process you want feedback from other developers. If you can't do the development yourself, then you have to convince someone else that it is a worthwhile project.

cython has a form of annotations, which it uses to generate efficient C code.


You referenced the array-like paragraph in numpy documentation. Note its typing information:

A simple way to find out if the object can be converted to a numpy array using array() is simply to try it interactively and see if it works! (The Python Way).

In other words the numpy developers refuse to be pinned down. They don't, or can't, describe in words what kinds of objects can or cannot be converted to np.ndarray.

In [586]: np.array({'test':1})   # a dictionary
Out[586]: array({'test': 1}, dtype=object)

In [587]: np.array(['one','two'])  # a list
Out[587]: 
array(['one', 'two'], 
      dtype='<U3')

In [589]: np.array({'one','two'})  # a set
Out[589]: array({'one', 'two'}, dtype=object)

For your own functions, an annotation like

def foo(x: np.ndarray) -> np.ndarray:

works. Of course if your function ends up calling some numpy function that passes its argument through asanyarray (as many do), such an annotation would be incomplete, since your input could be a list, or np.matrix, etc.
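One way to cover that case, in NumPy versions that ship numpy.typing (1.20 and later), is to annotate the input as ArrayLike and normalize it inside the function. This is only a minimal sketch: `scale` and its `factor` parameter are hypothetical names chosen for illustration.

```python
import numpy as np
import numpy.typing as npt


def scale(x: npt.ArrayLike, factor: float = 2.0) -> np.ndarray:
    """Accept anything np.asarray can convert; always return an ndarray."""
    # Lists, tuples, scalars and existing arrays are all converted here.
    arr = np.asarray(x, dtype=float)
    return arr * factor


print(scale([1, 2, 3]))     # a plain list is accepted
print(scale(np.arange(3)))  # so is an existing array
```

The hint on the parameter then describes what callers may pass, while the return hint describes what the function actually produces.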


When evaluating this question and answer, pay attention to the date. PEP 484 was relatively new back then, and code to make use of it for standard Python was still in development. But it looks like the links provided are still valid.

Sanbenito answered 28/2, 2016 at 20:45 Comment(6)
What software, editor or interpreter are you using that makes use of annotations? As best I know, in plain Python 3, a function gets an __annotations__ dictionary, but the interpreter does nothing with it.Sanbenito
Do you want typing annotations added to existing numpy functions (including np.array), or just types that would make it easier to add annotations to your own functions?Sanbenito
I've marked this answer as the accepted one, but just for completeness, I was going for the latter (type hinting in my own code, which uses Numpy). I'm all for Duck Typing, but when you can provide static type information, I don't see why you wouldn't, if only for static code analysis (PyCharm does warn about incompatible types). Thanks, @hpaulj!Gotama
Since the typing module only provides hints, I created two helper labels purely for readability (note that they don't pass mypy's static type checks): def Vector(np_arr): return np_arr.ndim == 1 and def Matrix(np_arr): return np_arr.ndim > 1. Hope it helps someone.Zorazorah
What about the shape? I can add hints like def blah() -> np.ndarray(785): but I can't add a second dimension like -> np.ndarray(785, 10). Having a shape hint is very helpful and brings clarity to multiple functions in my code that produce arrays of varying dimensionality.Strophanthus
@Strophanthus support for shapes is work-in-progress in python (peps.python.org/pep-0646) as well as numpy (github.com/numpy/numpy/issues/16544)Cantone
H
88

Numpy 1.21 includes a numpy.typing module with an NDArray generic type.


From the Numpy 1.21 docs:
numpy.typing.NDArray = numpy.ndarray[typing.Any, numpy.dtype[+ScalarType]]

A generic version of np.ndarray[Any, np.dtype[+ScalarType]].

Can be used during runtime for typing arrays with a given dtype and unspecified shape.

Examples:

>>> from typing import Any
>>> import numpy as np
>>> import numpy.typing as npt

>>> print(npt.NDArray)
numpy.ndarray[typing.Any, numpy.dtype[+ScalarType]]

>>> print(npt.NDArray[np.float64])
numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]

>>> NDArrayInt = npt.NDArray[np.int_]
>>> a: NDArrayInt = np.arange(10)

>>> def func(a: npt.ArrayLike) -> npt.NDArray[Any]:
...     return np.array(a)

As of 2022-09-05, support for shapes is still a work in progress per numpy/numpy#16544.
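For annotating your own functions, a dtype-specific alias can keep signatures short. The following is a sketch only: `FloatArray` and `normalize` are hypothetical names, not part of NumPy.

```python
import numpy as np
import numpy.typing as npt

# Alias for readability; a hypothetical name, not something NumPy provides.
FloatArray = npt.NDArray[np.float64]


def normalize(v: FloatArray) -> FloatArray:
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)


unit = normalize(np.array([3.0, 4.0]))
print(unit)

# Annotations are hints only: nothing is checked at runtime, so an
# integer array is accepted here even though the hint says float64.
also_runs = normalize(np.array([3, 4]))
print(also_runs.dtype)
```

Whether a static checker reports the second call depends on whether it can infer the dtype of the argument, so treat the hint as documentation first.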

Hurwitz answered 25/6, 2021 at 13:44 Comment(9)
I am just wondering: what if I use ndarray rather than NDArray in type hinting? Is there any fundamental difference?Beverleybeverlie
Looking at the definition of NDArray, it seems that (1) there is a difference at runtime (as NDArray is a generic alias while ndarray is a class), and (2) at type-checking time (e.g. when using mypy, pyright, etc.) there should be no difference between NDArray[Foo] and np.ndarray[Any, np.dtype[Foo]].Hurwitz
Is there a way to define the number of dimensions (Vector, Matrix, 3 dimensions, etc...)?Divinadivination
No, Numpy does not support that currently. There are long term efforts to support such shape hints in the python ecosystem, e.g. PEP 646 was recently introduced in python3.11. I suspect that Numpy will likely move towards eventual support of shape hints / number-of-dimension hints via PEP 646, but it will likely take a long time to implement and roll out. In the meantime, there are 3rd party libraries such as nptyping that provide type hints for Numpy with support for shape hints.Hurwitz
What's the first argument to the np.ndarray[...] type hint and why is it Any in all the examples?Malmo
I believe the first argument is intended for future use as a "shape annotation," where you'd hint to the type checker about what dimensions you expect the array to have. The examples use Any because numpy has not yet implemented support for such annotations.Hurwitz
Not sure if this question is covered above already, but it's not clear to me: does it matter whether I use npt.NDArray or np.ndarray as a type hint? If so, when should I use which?Immature
@Immature For humans and static type checkers reading the code, the type hint arr: NDArray[np.float64] gives more information than does arr: ndarray. This is potentially advantageous.Hurwitz
np.ndarray[Any, np.dtype[+ScalarType]] is unclear to me - I don't know what the + stands for there, nor ScalarType for that matter.Airman
P
33

At my company we've been using:

from typing import TypeVar, Generic, Tuple, Union, Optional
import numpy as np

Shape = TypeVar("Shape")
DType = TypeVar("DType")

class Array(np.ndarray, Generic[Shape, DType]):
    """  
    Use this to type-annotate numpy arrays, e.g. 
        image: Array['H,W,3', np.uint8]
        xy_points: Array['N,2', float]
        nd_mask: Array['...', bool]
    """
    pass

def compute_l2_norm(arr: Array['N,2', float]) -> Array['N', float]:
    return (arr**2).sum(axis=1)**.5

print(compute_l2_norm(arr = np.array([(1, 2), (3, 1.5), (0, 5.5)])))

We actually have a MyPy checker around this that checks that the shapes work out (which we should release at some point). The only thing is that it doesn't make PyCharm happy (i.e. you still get the nasty warning lines):
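The checker mentioned above is not public, so the following is not that tool. It is only a rough sketch of how shape strings such as 'N,2' could be validated at runtime; `check_shape` is a hypothetical helper, and its matching rules are assumptions made for this example.

```python
import numpy as np


def check_shape(arr: np.ndarray, spec: str) -> dict:
    """Match arr.shape against a spec like 'N,2' and return the bound names.

    Hypothetical helper for illustration: integers must match exactly,
    names bind to a size (and must be consistent if repeated), and
    '...' on its own accepts any shape.
    """
    if spec.strip() == "...":
        return {}
    dims = [d.strip() for d in spec.split(",")]
    if len(dims) != arr.ndim:
        raise ValueError(f"expected {len(dims)} dimensions, got {arr.ndim}")
    bound = {}
    for name, size in zip(dims, arr.shape):
        if name.isdigit():
            # A literal size such as '2' or '3' must match exactly.
            if int(name) != size:
                raise ValueError(f"expected size {name}, got {size}")
        elif bound.setdefault(name, size) != size:
            # A repeated name such as 'N,N' must bind to one size.
            raise ValueError(f"dimension {name!r} is {bound[name]}, got {size}")
    return bound


points = np.zeros((5, 2))
print(check_shape(points, "N,2"))  # binds N to 5
```

A runtime check like this complements the annotations rather than replacing a static checker, since it only fires when the code actually runs.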


Pontus answered 23/9, 2020 at 16:44 Comment(6)
any updates on the MyPy checker? would love to integrate it to my envReddy
This is good stuff, thanks for sharing. It seems, however, that the nptyping package (github.com/ramonhagenaars/nptyping) considerably generalizes this.Vivid
I still find myself using this version over nptyping for documentation purposes, because I find Array['2,2', int] easier to type than NDArray[Shape["2, 2"], Int], and you can give meaning via what you name dimensions, e.g. BGRImageArray = Array['H,W,3', 'uint8'] makes it clear that the first dimension is height. That said, if you actually intend to use mypy for type checking, definitely go for nptyping.Pontus
This sort of reinvents the wheel (albeit not a very good one), with numpy.typing being a thing.Asuncion
@Vivid Puzzlingly, nptyping doesn't currently seem to allow Mypy to check shape mismatches, so in that sense it is worse than the solution in this answer if you don't need support for all that extra stuff like recarrays...Hydroquinone
@Asuncion numpy.typing doesn't currently support shape annotations/checking. This solution does.Hydroquinone
S
11

nptyping adds lots of flexibility for specifying numpy type hints.

Screw answered 16/9, 2020 at 17:40 Comment(1)
nptyping is a life changer.. it really solved my problems when typing numpy arrays. It works great with unittests when one needs to verify the instance type, shape, etc. I highly recommend using it!Calcic
F
-1

What I did was to just define it as

Dict[Tuple[int, int], TYPE]

So for example if you want an array of floats you can do:

a = numpy.empty(shape=[2, 2], dtype=float) # type: Dict[Tuple[int, int], float]

This is of course not exact from a documentation perspective, but for analyzing correct usage and getting proper completion with PyCharm it works great!

Fortunio answered 18/10, 2016 at 19:39 Comment(3)
this is worse than using np.ndarray as a typeTutu
@JulesG.M., may I know what's the difference of using np.array and NDArray as a type? if you have a quick answer.Beverleybeverlie
this is an old comment, from before NDArray came to exist @Jiadong. NDArray is better now because it has tools to also indicate the dtype of the arrayTutu
