pretty printing numpy ndarrays using unicode characters
I have recently noticed that Python's printing is not consistent for NumPy ndarrays. For example, it prints a horizontal 1D array horizontally:

import numpy as np
A1=np.array([1,2,3])
print(A1)
#--> [1 2 3]

but a 1D horizontal array with redundant brackets vertically:

A2=np.array([[1],[2],[3]])
print(A2)
#--> [[1]
#     [2]
#     [3]]

a 1D vertical array horizontally:

A3=np.array([[1,2,3]])
print(A3)
#--> [[1 2 3]]

and a 2D array:

B=np.array([[11,12,13],[21,22,23],[31,32,33]])
print(B)
# --> [[11 12 13]
#      [21 22 23]
#      [31 32 33]]

where the first dimension is now vertical. It gets even worse for higher dimensions as all of them are printed vertically:

C=np.array([[[111,112],[121,122]],[[211,212],[221,222]]])
print(C)
#--> [[[111 112]
#      [121 122]]
#
#     [[211 212]
#      [221 222]]]

A consistent behavior, in my opinion, would be to print the even dimensions horizontally and the odd ones vertically. Using Unicode characters it would be possible to format it nicely. I was wondering if it is possible to create a function to print the above arrays as:

A1 --> [1 2 3]
A2 --> ┌┌─┐┌─┐┌─┐┐
       │ 1  2  3 │
       └└─┘└─┘└─┘┘
A3 --> ┌┌─┐┐ # \u250c\u2500\u2510 
       │ 1 │ # \u2502
       │ 2 │
       │ 3 │
       └└─┘┘ # \u2514\u2500\u2518 
B -->  ┌┌──┐┌──┐┌──┐┐ 
       │ 11  21  31 │
       │ 12  22  32 │
       │ 13  23  33 │
       └└──┘└──┘└──┘┘ 

C -->  ┌┌─────────┐┌─────────┐┐
       │ [111 112]  [211 212] │
       │ [121 122]  [221 222] │
       └└─────────┘└─────────┘┘ 

I found this gist which takes care of the different number of digits. I tried to prototype a recursive function to implement the above concept:

def npprint(A):
    assert isinstance(A, np.ndarray), "input of npprint must be a numpy ndarray"
    if A.ndim == 1:
        print(A)
    else:
        # Recurse over the second axis, printing one slice at a time.
        for i in range(A.shape[1]):
            npprint(A[:, i])

It kind of works for A1, A2, A3 and B but not for C. I would appreciate your help: what should npprint look like to produce the above output for numpy ndarrays of arbitrary dimension?

P.S.1. In a Jupyter environment one can use LaTeX \mathtools \underbracket and \overbracket in Markdown. Sympy's pretty printing functionality is also a great starting point. It can use ASCII, Unicode, LaTeX...

P.S.2. I'm being told that there is indeed a consistency in the way ndarrays are printed. However, IMHO it is kind of weird and non-intuitive. Having a flexible pretty printing function could help a lot to display ndarrays in different forms.

P.S.3. The Sympy guys have already considered both points I have mentioned here. Their Matrix module is pretty consistent (A1 and A2 are the same) and they also have a pprint function which does more or less what I expect from npprint here.

P.S.4. For those who follow up on this idea, I have integrated everything here in this Jupyter Notebook

Gamp answered 2/11, 2018 at 21:59 Comment(9)
A2 has shape (3,1). The first dimension is printed vertically. The 2nd as columns. C is (2,2,2), the first is displayed as space-separated blocks, the rest as rows/columns like the 2d B. Note also the use of brackets, which match the nesting of the equivalent lists.Shufu
@Shufu ah, true. it is not consistent either. I will fix it.Gamp
A2 doesn't have redundant brackets. Neither does A3. The shapes differ from A1. The brackets matter.Shufu
The numpy display is consistent. The last dimension (inner most) is always columns. 2nd to the last, rows. Then blocks separated with space and brackets and indentation. Then a higher level of separation. Displaying 3d and higher on a 2d screen will always have problems (that applies to writing csv files as well). But realistic, working, arrays are usually too large to display in full regardless of the layout.Shufu
@Shufu ok, now I'm even more confused. Numpy treats scalars also as array_like, so a=np.array(1) returns a valid numpy ndarray with a.ndim-->0 and an empty tuple for the shape, a.shape-->(). So Python's print displays a tuple of scalars horizontally, 1,2,3,4..., as the zeroth dimension, and putting brackets around them makes the first dimension, which is displayed vertically. Also, unlike MATLAB, numpy doesn't omit extra brackets. My understanding is that numpy ndarrays are not exactly multidimensional arrays as we know them in mathematics but rather advanced Python lists.Gamp
The data storage for ndarray is totally different from a list. docs.scipy.org/doc/numpy-1.15.0/reference/arrays.html. A 0d array is not quite the same as an array scalar which isn't quite the same as Python scalar.Shufu
The differences with MATLAB are too many to list in a comment. But a couple key ones - everything in MATLAB is 2d. Even 3d is a thin layer on top of the 2d. That's part of why it omits trailing singleton dimensions (beyond the 2nd). Trailing dimensions are the outermost (Fortran style). Both display the inner 2 dimensions as row/column blocks. A flattened matrix will have size (n,1) (in contrast to the (n,) shape in numpy).Shufu
@Shufu thanks for the explanations. I think the most unintuitive part for me is that it starts from the innermost pair of brackets, while to me the outermost seems like the first. Anyway, the npprint function could take other inputs to display the ndarrays according to the default numpy layout or others.Gamp
@Shufu I added a sample implementation to show what I have in mind. I would appreciate if you could take a look.Gamp
G
8

It was quite a revelation to me to understand that numpy arrays are not anything like the MATLAB matrices or multidimensional mathematical arrays I had in mind. They are rather homogeneous and uniform nested Python lists. I also understood that the first dimension of a numpy array is the deepest/innermost pair of square brackets, which is printed horizontally. Then, from there, the second dimension is printed vertically, and the third vertically with a blank line in between...

I think having a pprint function (inspired by Sympy's naming convention) could greatly help. So, I'm going to put a very bad implementation here, hoping it will inspire other advanced Pythoners to come up with better solutions:

def pprint(A):
    if A.ndim == 1:
        print(A)
    else:
        # Width of each column: the longest str() among its entries.
        widths = [max(len(str(s)) for s in A[:, i]) for i in range(A.shape[1])]
        # Row width: the entries, one space between them, and the two brackets.
        w = sum(widths) + len(widths) + 1
        print('\u250c' + '\u2500' * w + '\u2510')
        for AA in A:
            # Left-justify every entry to the width of its column.
            cells = [str(AAA).ljust(w1) for AAA, w1 in zip(AA, widths)]
            print(' [' + ' '.join(cells) + ']')
        print('\u2514' + '\u2500' * w + '\u2518')

and the result is somewhat acceptable for 1D and 2D arrays:

B1=np.array([[111,122,133],[21,22,23],[31,32,33]])
pprint(B1)

#┌─────────────┐
# [111 122 133]
# [21  22  23 ]
# [31  32  33 ]
#└─────────────┘

This is indeed very bad code; it only works for integers. Hopefully, others will come up with better solutions.
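
One way around the integer-only limitation, sketched here under the assumption that numpy's own element formatting is acceptable, is to let the public np.array2string produce the text and only draw the frame around it. The name boxed is hypothetical, and the brackets and layout remain numpy's default rather than the alternating layout asked for in the question.

```python
import numpy as np

def boxed(A):
    """Return numpy's default text for A inside a Unicode frame (hypothetical helper)."""
    # np.array2string handles floats, complex numbers, booleans, etc.
    body = np.array2string(np.asarray(A)).splitlines()
    width = max(len(line) for line in body)
    top = '\u250c' + '\u2500' * (width + 2) + '\u2510'
    bottom = '\u2514' + '\u2500' * (width + 2) + '\u2518'
    # Pad every line to the same width so the right border lines up.
    middle = ['\u2502 ' + line.ljust(width) + ' \u2502' for line in body]
    return '\n'.join([top] + middle + [bottom])

print(boxed(np.array([[1.5, 22.25], [333.0, 4.0]])))
```

Because the column alignment is delegated to numpy, the widths do not have to be computed by hand, at the price of keeping numpy's nested brackets inside the frame.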

P.S.1. Eric Wieser has already implemented a very nice HTML prototype for IPython/Jupyter (the screenshot is not reproduced here).

You may follow the discussion on numpy mailing list here.

P.S.2. I also posted this idea here on Reddit.

P.S.3. I spent some time extending the code to 3D arrays:

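def ndtotext(A, w=None, h=None):
    # h is kept in the signature but is recomputed in the 3D branch.
    if A.ndim == 1:
        if w is None:
            return str(A)
        # Pad each entry to its column width w[i], with one space between entries.
        s = '['
        for i, AA in enumerate(A[:-1]):
            s += str(AA) + ' ' * (max(w[i], len(str(AA))) - len(str(AA)) + 1)
        s += str(A[-1]) + ' ' * (max(w[-1], len(str(A[-1]))) - len(str(A[-1]))) + '] '
    elif A.ndim == 2:
        # Column widths, then the row width including separators and brackets.
        w1 = [max([len(str(s)) for s in A[:, i]]) for i in range(A.shape[1])]
        w0 = sum(w1) + len(w1) + 1
        s = '\u250c' + '\u2500' * w0 + '\u2510' + '\n'
        for AA in A:
            s += ' ' + ndtotext(AA, w=w1) + '\n'
        s += '\u2514' + '\u2500' * w0 + '\u2518'
    elif A.ndim == 3:
        # Left and right borders, one line for every line of a 2D block.
        h = A.shape[1]
        s1 = '\u250c' + '\n' + ('\u2502' + '\n') * h + '\u2514' + '\n'
        s2 = '\u2510' + '\n' + ('\u2502' + '\n') * h + '\u2518' + '\n'
        # Render each 2D slice, then glue the blocks together line by line.
        strings = [ndtotext(a) + '\n' for a in A]
        strings.append(s2)
        strings.insert(0, s1)
        s = '\n'.join(''.join(pair) for pair in zip(*map(str.splitlines, strings)))
    else:
        raise NotImplementedError('ndtotext handles only 1D, 2D and 3D arrays')
    return s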

and as an example:

shape = 4, 3, 6
B2=np.arange(np.prod(shape)).reshape(shape)
print(B2)
print(ndtotext(B2))        


[[[ 0  1  2  3  4  5]
  [ 6  7  8  9 10 11]
  [12 13 14 15 16 17]]

 [[18 19 20 21 22 23]
  [24 25 26 27 28 29]
  [30 31 32 33 34 35]]

 [[36 37 38 39 40 41]
  [42 43 44 45 46 47]
  [48 49 50 51 52 53]]

 [[54 55 56 57 58 59]
  [60 61 62 63 64 65]
  [66 67 68 69 70 71]]]
┌┌───────────────────┐┌───────────────────┐┌───────────────────┐┌───────────────────┐┐
│ [0  1  2  3  4  5 ]  [18 19 20 21 22 23]  [36 37 38 39 40 41]  [54 55 56 57 58 59] │
│ [6  7  8  9  10 11]  [24 25 26 27 28 29]  [42 43 44 45 46 47]  [60 61 62 63 64 65] │
│ [12 13 14 15 16 17]  [30 31 32 33 34 35]  [48 49 50 51 52 53]  [66 67 68 69 70 71] │
└└───────────────────┘└───────────────────┘└───────────────────┘└───────────────────┘┘
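
For arbitrary dimensions, the same idea can be written recursively. The following is only a rough sketch under my own assumptions, not a finished solution: axes are counted from the innermost one, odd levels are joined horizontally and even levels are stacked vertically, every cell is right-justified to one common width, and the array is non-empty. The names nd_lines, _frame and the fmt parameter are hypothetical.

```python
import numpy as np

def _frame(lines):
    """Draw a Unicode box around a rectangular block of text lines."""
    width = len(lines[0])
    top = '\u250c' + '\u2500' * width + '\u2510'
    bottom = '\u2514' + '\u2500' * width + '\u2518'
    return [top] + ['\u2502' + line + '\u2502' for line in lines] + [bottom]

def nd_lines(A, fmt=str, width=None):
    """Return the text lines for A (assumed non-empty), alternating the join direction."""
    A = np.asarray(A)
    if width is None:
        # One common cell width keeps every sub-block rectangular.
        width = max(len(fmt(x)) for x in A.ravel().tolist())
    if A.ndim == 0:
        return [fmt(A.item()).rjust(width)]
    blocks = [nd_lines(a, fmt, width) for a in A]
    if A.ndim % 2:
        # Odd level (1st, 3rd, ... from the inside): place blocks side by side.
        lines = [' '.join(row) for row in zip(*blocks)]
    else:
        # Even level (2nd, 4th, ...): stack blocks on top of each other.
        lines = [line for block in blocks for line in block]
    # Frame everything from 2D upward; a 1D array stays a plain line.
    return _frame(lines) if A.ndim >= 2 else lines

print('\n'.join(nd_lines(np.arange(24).reshape(2, 2, 2, 3))))
print('\n'.join(nd_lines(np.array([[0.5, 12.25], [3.0, 4.0]]), fmt='{:.2f}'.format)))
```

Since fmt is an ordinary callable, floats or other dtypes can be handled by passing a different formatter, as in the second call.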
Gamp answered 6/11, 2018 at 1:20 Comment(3)
It's not documented and might go away at any point, but if you don't care about that, you can use fmt = numpy.core.arrayprint._get_format_function(A), and then call fmt in place of str in that example - then it will work for other types too, and you won't need to deal with the column spacing yourself.Cartelize
@Cartelize Thanks a lot for the awesome prototype. In addition to the points I mentioned in my reply on the Numpy mailing list, I also see that you have stacked the last dimension vertically. I think it would be way better to follow the even-horizontal odd-vertical convention. Plus, here I asked the Variable Inspector guys to see if they can add anything like this to the Jupyter Notebook extension.Gamp
@FoadS.Farimani could you share the code by Eric Wieser if you have access to it? Thank you.Blakey
D
0

In each of these cases, each instance of your final dimension is printed on a single line. There's nothing inconsistent here.

Try various forms of:

a = np.random.rand(5, 4, 3)
print(a)

Change the number of dimensions in a (e.g. by adding more integers separated by commas). You'll find that each time you print a, each row in the printed object will have k values, where k is the last integer in a's shape.
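
As a quick check of that claim, here is a small sketch that counts the values on every non-blank line of the printed array. It uses np.arange instead of np.random.rand so the result is reproducible, and it assumes the last axis is short enough that numpy does not wrap a row across several lines.

```python
import numpy as np

# Deterministic stand-in for np.random.rand(5, 4, 3).
a = np.arange(5 * 4 * 3).reshape(5, 4, 3)

# Non-blank lines of the printed array (blank lines separate the 2D blocks).
rows = [line for line in str(a).splitlines() if line.strip()]

# Strip the brackets and count the values on each line.
values_per_row = {len(line.replace('[', ' ').replace(']', ' ').split()) for line in rows}

print(len(rows), values_per_row)  # 20 {3}
```

There are 5 * 4 = 20 printed rows, and each of them holds exactly 3 values, the last integer in the shape.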

Defrock answered 3/11, 2018 at 16:54 Comment(8)
Thanks for the response. I think the main issue, as I have also mentioned in the comments on the OP, is that Numpy ndarrays are not exactly multidimensional mathematical arrays but rather advanced Python lists, which are basically pointers and addresses. Numpy does not treat ndarrays the way MATLAB does. Anyway, having the results pretty printed would help a lot. Sympy has some Unicode functionality. Numpy could have it too.Gamp
@Foad what makes you say numpy arrays are not multidimensional arrays? Numpy is not Matlab, correctDefrock
for example MATLAB omits the extra brackets, as you would expect mathematically. It seems every time we put brackets around a tuple of values we are creating a pointer towards a C array.Gamp
@Foad I don't know what you mean by "extra" brackets -- brackets denote the dimensionality of the given object; there's no such thing as extra brackets in numpy. Tuples are quite different from numpy arrays as well. I'm sorry but I'm not following your confusion.Defrock
that's why it is called confusion :)) well, if you try MATLAB/Octave/Scilab/Julia, A1 and A2 would give the same result. And by tuple I mean 1,2,3... this is a valid Python tuple, but not relevant to the discussion anyway. In the end, the reason I posted this question is to have pretty printing; regardless of how numpy treats ndarrays, having prettified results in different forms would help.Gamp
If you just want to smash everything to a consistent 1D, you could always print(my_array.ravel()). In that case A1 and A2 would print the same values...Defrock
nah... what I would like to have in the end is to pretty print ndarrays using Unicode. Like what we already have in Sympy or, to some extent, Pandas...Gamp
I would appreciate it if you could take a look at my sample code. Maybe you can improve and build upon it?Gamp
