Reconcile np.fromiter and multidimensional arrays in Python
I love using np.fromiter from NumPy because it is a lazy, memory-friendly way to build np.ndarray objects. However, it does not seem to support multidimensional arrays, which are quite useful as well.

import numpy as np

def fun(i):
    """ A function returning 4 values of the same type.
    """
    return tuple(4*i + j for j in range(4))

# Trying to create a 2-dimensional array from it:
a = np.fromiter((fun(i) for i in range(5)), '4i', 5) # fails

# This function only seems to work for 1D array, trying then:
a = np.fromiter((fun(i) for i in range(5)),
        [('', 'i'), ('', 'i'), ('', 'i'), ('', 'i')], 5) # painful

# .. `a` now looks like a 2D array but it is not:
a.transpose() # doesn't work as expected
a[0, 1] # too many indices (of course)
a[:, 1] # don't even think about it

How can I get a to be a multidimensional array while keeping such a lazy construction based on generators?

Orchidectomy answered 1/12, 2015 at 10:47 Comment(0)
D
1

Short update on the question: as of NumPy 1.23, it is possible to do exactly what the example in the question attempts:

import numpy as np

def fun(i):
    """A function returning 4 values of the same type."""
    return tuple(4*i + j for j in range(4))

# Trying to create a 2-dimensional array from it:
a = np.fromiter((fun(i) for i in range(5)), dtype='4i', count=5)
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15],
#        [16, 17, 18, 19]], dtype=int32)

Personally, I find it more readable to pass the data types directly instead of using strings (note that 'i' results in int32 and not the default integer type, which is int64 on most 64-bit platforms):

a = np.fromiter((fun(i) for i in range(5)), dtype=np.dtype((int, 4)), count=5)
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15],
#        [16, 17, 18, 19]])
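The difference between the two dtype spellings can be checked directly. This is only a small sketch; the variable names sub_i and sub_int are made up for illustration, and the base type of the second dtype depends on the platform and NumPy version:

```python
import numpy as np

# Both spellings describe a "subarray" dtype: a base type plus a shape.
sub_i = np.dtype('4i')        # four C ints per element
sub_int = np.dtype((int, 4))  # four default-integer values per element

print(sub_i.base, sub_i.shape)      # base is the C int type (typically int32)
print(sub_int.base, sub_int.shape)  # base is NumPy's default integer type
```

Both dtypes have shape (4,), so either one tells np.fromiter that every item coming from the iterator is a row of four values.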

See also the documentation of fromiter which contains a similar example.
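If the code has to run on both old and new NumPy versions, one option is a small wrapper that picks the subarray dtype when it is available and otherwise flattens the rows lazily with itertools.chain. This is a sketch under assumptions: rows_to_2d is a hypothetical helper name (not part of NumPy), and the version check simply parses np.__version__.

```python
from itertools import chain

import numpy as np


def rows_to_2d(rows, width, count, base=np.int64):
    """Build a (count, width) array from an iterable of fixed-length rows.

    Hypothetical helper for illustration; not a NumPy function.
    """
    version = tuple(int(p) for p in np.__version__.split(".")[:2])
    if version >= (1, 23):
        # Subarray dtype: each item from the iterator fills one row.
        return np.fromiter(rows, dtype=np.dtype((base, width)), count=count)
    # Older NumPy: flatten the rows lazily, then reshape the 1D result.
    flat = np.fromiter(chain.from_iterable(rows), dtype=base,
                       count=count * width)
    return flat.reshape(count, width)


def fun(i):
    return tuple(4 * i + j for j in range(4))


a = rows_to_2d((fun(i) for i in range(5)), width=4, count=5)
print(a.shape)    # (5, 4)
print(a[0, 1])    # 1
print(a[:, 1])    # [ 1  5  9 13 17]
print(a.T.shape)  # (4, 5)
```

Unlike the structured array in the question, the result is a regular 2D array, so transposing and two-index slicing behave as expected.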

Defensible answered 25/4, 2023 at 21:38 Comment(0)
B
19

In NumPy versions before 1.23, np.fromiter only supports constructing 1D arrays, so it expects an iterable that yields individual values rather than tuples, lists, or other sequences. One way to work around this limitation is to use itertools.chain.from_iterable to lazily 'unpack' the output of your generator expression into a single 1D sequence of values:

import numpy as np
from itertools import chain

def fun(i):
    return tuple(4*i + j for j in range(4))

a = np.fromiter(chain.from_iterable(fun(i) for i in range(5)), 'i', 5 * 4)
a.shape = 5, 4

print(repr(a))
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15],
#        [16, 17, 18, 19]], dtype=int32)
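
If the number of rows is not known in advance, the same approach should still work without the count argument. A minimal sketch, assuming every row has exactly four values; the variable names rows and flat are only illustrative:

```python
from itertools import chain

import numpy as np


def fun(i):
    return tuple(4 * i + j for j in range(4))


# Pretend the number of rows is unknown to the caller.
rows = (fun(i) for i in range(5))

# Without count, fromiter consumes the iterator until it is exhausted.
flat = np.fromiter(chain.from_iterable(rows), dtype=np.int32)

# -1 lets NumPy infer the number of rows from the total length.
a = flat.reshape(-1, 4)
print(a.shape)  # (5, 4)
```

The trade-off is that NumPy cannot pre-allocate the output when count is omitted, so it resizes the buffer as it goes. Passing count remains preferable whenever the size is known.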
Barns answered 1/12, 2015 at 12:20 Comment(0)
