How to apply outer product for tensors without unnecessary increase of dimensions?

I have two vectors v and w and I want to make a matrix m out of them such that:

m[i, j] = v[i] * w[j]

In other words, I want to calculate their outer product. I can do that either with theano.tensor.outer or by adding new axes to v and w and using the dot product.

m = T.dot(v[:,numpy.newaxis], w[numpy.newaxis,:])
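For the vector case, the same newaxis trick works in plain NumPy, which makes it easy to sanity-check against np.outer (a small sketch, not Theano code):

```python
import numpy as np

v = np.array([1., 2., 3.])
w = np.array([4., 5.])

# Outer product via explicit new axes, mirroring the Theano expression above
m = v[:, np.newaxis] * w[np.newaxis, :]   # shape (3, 2)

# np.outer computes the same matrix
assert np.array_equal(m, np.outer(v, w))
```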

Now I am trying to solve a slightly more general problem. Instead of two vectors v and w, I have two matrices (again called v and w) and I would like to calculate the outer product of each row of matrix v with the corresponding row of matrix w (the i-th row of the first matrix should be multiplied with the i-th row of the second matrix). So I would like to do something like this:

m1 = T.tensordot(v[:,:, numpy.newaxis], w[:,:,numpy.newaxis], axes = [[2],[2]])
m[i, j, k] = m1[i, k, j, k]

In other words, m[:, :, k] is the outer product of the k-th row of matrix v with the k-th row of matrix w.

I see two problems with the above "solution". First, it is not really a solution, since the second line is not valid Theano code. So my first question is how to do this "advanced slicing" by forcing some indices to be equal; for example, m[i, k] = a[i, k, i, i, k]. Second, I do not like that I first create a 4D tensor (m1) from two 2D tensors and then reduce it back to a 3D tensor. That can be very memory-consuming, and I suspect it can be avoided.
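As a plain-NumPy sketch of the intended semantics (batch index k last, as in the question): the explicit loop below is the reference, and np.einsum produces the same 3D array in one step, without ever materializing a 4D intermediate.

```python
import numpy as np

v = np.arange(12).reshape(3, 4)   # 3 rows of length 4
w = np.arange(15).reshape(3, 5)   # 3 rows of length 5

# Reference semantics: m[:, :, k] is the outer product of the
# k-th row of v with the k-th row of w
m_ref = np.empty((4, 5, 3), dtype=v.dtype)
for k in range(3):
    m_ref[:, :, k] = np.outer(v[k], w[k])

# Same result in one step; no 4D tensor is created
m = np.einsum('ki,kj->ijk', v, w)
assert np.array_equal(m, m_ref)
```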

Telford answered 7/2, 2017 at 16:5 Comment(0)

We need to introduce broadcastable dimensions into the two input matrices with dimshuffle and then let broadcasting take care of the elementwise multiplication, resulting in the outer product between corresponding rows.

Thus, with V and W as the Theano matrices, simply do:

V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)

In NumPy, we have np.newaxis to extend dimensions and np.transpose() to permute them. In Theano, dimshuffle does both of these tasks: it takes a mix of existing dimension indices and 'x' entries that introduce new broadcastable axes.
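The NumPy equivalent of the two dimshuffle calls is inserting a trailing and a middle axis with None; a small sketch of that correspondence (NumPy only, since it is easier to run):

```python
import numpy as np

V = np.arange(6).reshape(2, 3)    # plays the role of the Theano matrix V
W = np.arange(8).reshape(2, 4)    # plays the role of the Theano matrix W

# V.dimshuffle(0, 1, 'x')  ->  V[:, :, None]   shape (2, 3, 1)
# W.dimshuffle(0, 'x', 1)  ->  W[:, None, :]   shape (2, 1, 4)
out = V[:, :, None] * W[:, None, :]            # broadcasts to (2, 3, 4)

# Each batch slice is the outer product of the corresponding rows
for i in range(2):
    assert np.array_equal(out[i], np.outer(V[i], W[i]))
```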

Sample run

1) Inputs :

# Numpy arrays
In [121]: v = np.random.randint(11,99,(3,4))
     ...: w = np.random.randint(11,99,(3,5))
     ...: 

# Perform outer product on corresponding rows in inputs
In [122]: for i in range(v.shape[0]):
     ...:     print(np.outer(v[i],w[i]))
     ...:     
[[2726 1972 1740 2117 1972]
 [8178 5916 5220 6351 5916]
 [7520 5440 4800 5840 5440]
 [8648 6256 5520 6716 6256]]
[[8554 3458 8918 4186 4277]
 [1786  722 1862  874  893]
 [8084 3268 8428 3956 4042]
 [2444  988 2548 1196 1222]]
[[2945 2232 1209  372  682]
 [2565 1944 1053  324  594]
 [7125 5400 2925  900 1650]
 [6840 5184 2808  864 1584]]

2) Theano part :

# Get to theano : Get the theano matrix versions 
In [123]: V = T.matrix('v')
     ...: W = T.matrix('w')
     ...: 

# Use proposed code
In [124]: OUT = V.dimshuffle(0, 1, 'x')*W.dimshuffle(0, 'x', 1)

# Create a function out of it and then use on input NumPy arrays
In [125]: f = function([V,W], OUT)

3) Verify results :

In [126]: f(v,w)    # Verify results against the earlier loopy results
Out[126]: 
array([[[ 2726.,  1972.,  1740.,  2117.,  1972.],
        [ 8178.,  5916.,  5220.,  6351.,  5916.],
        [ 7520.,  5440.,  4800.,  5840.,  5440.],
        [ 8648.,  6256.,  5520.,  6716.,  6256.]],

       [[ 8554.,  3458.,  8918.,  4186.,  4277.],
        [ 1786.,   722.,  1862.,   874.,   893.],
        [ 8084.,  3268.,  8428.,  3956.,  4042.],
        [ 2444.,   988.,  2548.,  1196.,  1222.]],

       [[ 2945.,  2232.,  1209.,   372.,   682.],
        [ 2565.,  1944.,  1053.,   324.,   594.],
        [ 7125.,  5400.,  2925.,   900.,  1650.],
        [ 6840.,  5184.,  2808.,   864.,  1584.]]])
Gauldin answered 9/2, 2017 at 17:55 Comment(0)

Are you looking for something like this?

>>> a = b = np.arange(8).reshape([2,4])
>>> a[:,None,:]*b[:,:,None]
array([[[ 0,  0,  0,  0],
        [ 0,  1,  2,  3],
        [ 0,  2,  4,  6],
        [ 0,  3,  6,  9]],

       [[16, 20, 24, 28],
        [20, 25, 30, 35],
        [24, 30, 36, 42],
        [28, 35, 42, 49]]])
Aboveboard answered 9/2, 2017 at 16:30 Comment(0)

I can't believe nobody has tried to use np.einsum.

w
array([[1, 8, 9, 2],
       [1, 2, 9, 0],
       [5, 8, 7, 3],
       [2, 9, 8, 2]])

v 
array([[1, 4, 5, 9],
       [9, 1, 3, 7],
       [9, 6, 1, 5],
       [4, 9, 7, 0]])

for i in range(w.shape[0]):
     print(np.outer(w[i], v[i]))

[[ 1  4  5  9]
 [ 8 32 40 72]
 [ 9 36 45 81]
 [ 2  8 10 18]]
[[ 9  1  3  7]
 [18  2  6 14]
 [81  9 27 63]
 [ 0  0  0  0]]
[[45 30  5 25]
 [72 48  8 40]
 [63 42  7 35]
 [27 18  3 15]]
[[ 8 18 14  0]
 [36 81 63  0]
 [32 72 56  0]
 [ 8 18 14  0]]

np.einsum('ij,ik->ijk', w, v)

array([[[ 1,  4,  5,  9],
        [ 8, 32, 40, 72],
        [ 9, 36, 45, 81],
        [ 2,  8, 10, 18]],

       [[ 9,  1,  3,  7],
        [18,  2,  6, 14],
        [81,  9, 27, 63],
        [ 0,  0,  0,  0]],

       [[45, 30,  5, 25],
        [72, 48,  8, 40],
        [63, 42,  7, 35],
        [27, 18,  3, 15]],

       [[ 8, 18, 14,  0],
        [36, 81, 63,  0],
        [32, 72, 56,  0],
        [ 8, 18, 14,  0]]])

It looks like the equivalent Theano function is theano.tensor.batched_dot (which is supposed to be even faster than einsum), but I have no experience with Theano.
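The einsum expression above is the same thing as a batched matrix product of a column vector with a row vector, which is what batched_dot computes; a NumPy sketch of that equivalence using the @ operator:

```python
import numpy as np

w = np.arange(8).reshape(2, 4)
v = np.arange(6).reshape(2, 3)

# einsum form from above
out_einsum = np.einsum('ij,ik->ijk', w, v)

# Equivalent batched matrix product: (2, 4, 1) @ (2, 1, 3) -> (2, 4, 3)
out_matmul = w[:, :, None] @ v[:, None, :]

assert np.array_equal(out_einsum, out_matmul)
```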

Coping answered 14/2, 2017 at 7:25 Comment(1)
A perfect application of a DSL... in this case Einstein notation / summation convention :) – Yeah
