How to understand an empty dimension in a Python NumPy array?

Asked 4/4, 2016 at 21:0 Answered 8/9, 2018 at 14:59

In Python's NumPy package, I am having trouble understanding the case where an ndarray's 2nd dimension appears to be empty. Here is an example:

    In[1]: d2 = np.random.rand(10)
    In[2]: d2.shape = (-1, 1)

    In[3]: print(d2.shape)
    In[4]: print(d2)

    In[5]: print(d2[::2, 0].shape)
    In[6]: print(d2[::2, 0])

    Out[3]:(10, 1)
    Out[4]:
    [[ 0.12362278]
     [ 0.26365227]
     [ 0.33939172]
     [ 0.91501369]
     [ 0.97008342]
     [ 0.95294087]
     [ 0.38906367]
     [ 0.1012371 ]
     [ 0.67842086]
     [ 0.23711077]]

    Out[5]: (5,)
    Out[6]: [ 0.12362278  0.33939172  0.97008342  0.38906367  0.67842086]

My understanding is that d2 is a 10-row by 1-column ndarray. Out[6] is obviously a 1 by 5 array, so how can its dimensions be (5,)? What does the empty 2nd dimension mean?

Drillmaster answered 4/4, 2016 at 21:0 Comment(2)

(5,) is just Python's way of stringifying a tuple with one entry, because (5) would be ambiguous (or rather just 5). – Phosphate 4/4, 2016 at 21:6

"Out[6] is obviously a 1 by 5 array" - no, there's no "1 by" on that array. It's 1-dimensional, with its only dimension having length 5. – Astroid 4/4, 2016 at 22:34
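To make the two comments above concrete, here is a minimal sketch. The variable names are made up for illustration; only standard Python and NumPy behaviour is assumed.

```python
import numpy as np

one_tuple = (5,)   # the trailing comma makes this a tuple with one entry
just_int = (5)     # parentheses only group here, so this is the int 5

print(type(one_tuple), type(just_int))

a = np.zeros(5)        # 1-D: a single axis of length 5
b = np.zeros((1, 5))   # 2-D: one row, five columns

print(a.shape, a.ndim)  # (5,) 1
print(b.shape, b.ndim)  # (1, 5) 2
```

So a shape of (5,) does not have an empty second slot; the array simply has no second axis.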

Let me give you one example that illustrates an important difference.

    import numpy as np

    d1 = np.array([1, 2, 3, 4, 5])  # array([1, 2, 3, 4, 5])
    d1.shape   # (5,)  one-dimensional: neither a row nor a column
    d1.size    # 5
    # Note: d1.T is the same as d1.

    d2 = d1[np.newaxis]  # array([[1, 2, 3, 4, 5]]). Note the extra []
    d2.shape   # (1, 5)
    d2.size    # 5
    # Note: d2.T gives a column array:
    # array([[1],
    #        [2],
    #        [3],
    #        [4],
    #        [5]])
    d2.T.shape  # (5, 1)
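As a small follow-up sketch (not part of the original answer), the snippet below shows that transposing a 1-D array changes nothing, and lists a few common ways to get an explicit row or column. The names col_a, col_b and row are illustrative.

```python
import numpy as np

d1 = np.array([1, 2, 3, 4, 5])

# Transposing a 1-D array is a no-op: there is only one axis to permute.
print(d1.T.shape)  # (5,)

col_a = d1[:, np.newaxis]   # add a length-1 axis after the existing one
col_b = d1.reshape(-1, 1)   # -1 lets NumPy infer the length (5 here)
row = d1.reshape(1, -1)     # explicit 1 x 5 row

print(col_a.shape, col_b.shape, row.shape)  # (5, 1) (5, 1) (1, 5)

# ravel() flattens back to a single dimension
print(col_a.ravel().shape)  # (5,)
```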

Lactiferous answered 4/4, 2016 at 21:23 Comment(0)

I also used to think that ndarrays would represent even 1-d arrays as 2-d arrays with a thickness of 1. Maybe the name "ndarray" makes us think of high dimensions, but n can be 1, so an ndarray can have just one dimension.

Compare these:

    import numpy as np

    x = np.array([[1], [2], [3], [4]])
    x.shape
    # (4, 1)
    x = np.array([[1, 2, 3, 4]])
    x.shape
    # (1, 4)
    x = np.array([1, 2, 3, 4])
    x.shape
    # (4,)

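Here (4,) is a tuple with a single entry: the array has one dimension of length 4. (Written without the comma, (4) is just the integer 4, which NumPy also accepts as a shape.)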

If I reshape x to (2, 2) and then back to (4,), I get the original array back:

    x.shape = (2, 2)
    x
    # array([[1, 2],
    #        [3, 4]])
    x.shape = (4,)
    x
    # array([1, 2, 3, 4])

Juvenescent answered 8/9, 2018 at 14:59 Comment(0)

The main thing to understand here is that indexing with an integer is different from indexing with a slice. For example, when you index a 1d array or a list with an integer you get a scalar, but when you index with a slice you get an array or a list, respectively. The same applies to arrays with 2 or more dimensions. For example:

    # Make a 3d array:
    import numpy as np
    array = np.arange(60).reshape((3, 4, 5))

    # Indexing with ints gives a scalar
    print(array[2, 3, 4] == 59)
    # True

    # Indexing with slices gives a 3d array
    print(array[:2, :2, :2].shape)
    # (2, 2, 2)

    # Indexing with a mix of slices and ints will give an array with < 3 dims
    print(array[0, :2, :3].shape)
    # (2, 3)
    print(array[:, 2, 0:1].shape)
    # (3, 1)
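Applied to the array from the question, the same rule explains the (5,) result. This is a minimal sketch; the names by_int, by_slice and by_list are illustrative.

```python
import numpy as np

d2 = np.random.rand(10).reshape(-1, 1)   # shape (10, 1), as in the question

by_int = d2[::2, 0]       # an integer index removes the second axis
by_slice = d2[::2, 0:1]   # a length-1 slice keeps it
by_list = d2[::2, [0]]    # a one-element list also keeps it

print(by_int.shape)    # (5,)
print(by_slice.shape)  # (5, 1)
print(by_list.shape)   # (5, 1)
```

All three hold the same five values; only the number of axes differs.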

This can be really useful conceptually, because sometimes it's great to think of an array as a collection of vectors. For example, I can represent N points in space as an (N, 3) array:

    n_points = np.random.random([10, 3])
    point_2 = n_points[2]
    print(all(point_2 == n_points[2, :]))
    # True
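Reductions along an axis drop that axis in much the same way an integer index does. The sketch below goes slightly beyond the original answer: it uses the (N, 3) points layout from above, with np.linalg.norm and its keepdims argument picked as one illustrative example.

```python
import numpy as np

points = np.random.random((10, 3))   # 10 points, 3 coordinates each

# Reducing over axis 1 removes it, leaving one value per point.
norms = np.linalg.norm(points, axis=1)
print(norms.shape)      # (10,)

# keepdims=True keeps the reduced axis with length 1.
norms_col = np.linalg.norm(points, axis=1, keepdims=True)
print(norms_col.shape)  # (10, 1)

# (10, 3) / (10, 1) broadcasts, dividing each row by its own norm.
unit = points / norms_col
print(unit.shape)       # (10, 3)
```

Keeping the length-1 axis is what lets the division line up row by row; dividing by the (10,) version would fail to broadcast against (10, 3).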

Overwind answered 4/4, 2016 at 21:49 Comment(0)
