Outer product in tensorflow

In tensorflow, there are nice functions for entrywise and matrix multiplication, but after looking through the docs, I cannot find any internal function for taking an outer product of two tensors, i.e., making a bigger tensor by all possible products of elements of smaller tensors (like numpy.outer):

v_{i,j} = x_i*h_j

or

M_{ij,kl} = A_{ij}*B_{kl}

Does tensorflow have such a function?
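
For reference, this is roughly what the first case looks like with numpy.outer (values here are just illustrative):

import numpy as np

x = np.array([1., 2., 3.])
h = np.array([4., 5.])
v = np.outer(x, h)   # shape (3, 2), v[i, j] = x[i] * h[j]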

Scuba answered 22/11, 2015 at 17:47 Comment(0)

Yes, you can do this by taking advantage of the broadcast semantics of tensorflow. Reshape the first tensor to size 1xN and the second to size Mx1; when you multiply them, broadcasting gives you an MxN result containing all of the pairwise products.

You can play around with the same thing in numpy to see how it behaves in a simpler context, btw:

import numpy as np

a = np.array([1, 2, 3, 4, 5]).reshape([5, 1])    # column vector, shape (5, 1)
b = np.array([6, 7, 8, 9, 10]).reshape([1, 5])   # row vector, shape (1, 5)
a * b                                            # broadcasts to shape (5, 5): the outer product

How exactly you do it in tensorflow depends a bit on which axes you want to use and what semantics you want for the resulting multiply, but the general idea applies.
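
In tensorflow the same idea might look like this (a minimal sketch; the tensor names and values are just illustrative):

import tensorflow as tf

x = tf.constant([1., 2., 3.])                              # shape (3,)
h = tf.constant([4., 5.])                                  # shape (2,)
outer = tf.reshape(x, [-1, 1]) * tf.reshape(h, [1, -1])    # shape (3, 2), outer[i, j] = x[i] * h[j]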

Taphouse answered 22/11, 2015 at 18:8 Comment(2)
Also a[..., None] * b[None, ...] – Grossman
With Tensorflow 2, it would for example look like: a = tf.expand_dims(a, axis=-1); b = tf.expand_dims(b, axis=-1); outer = tf.matmul(a, b, transpose_b=True) – Ivonne

It is somewhat surprising that until recently there was no easy and "natural" way of doing an outer product between arbitrary tensors (also known as "tensor product") in tensorflow, especially given the name of the library...

With tensorflow>=1.6 you can now finally get what you want with a simple:

M = tf.tensordot(A, B, axes=0)
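
For example, with illustrative shapes, the result simply stacks the indices of both inputs:

import tensorflow as tf

A = tf.ones([2, 3])
B = tf.ones([4, 5])
M = tf.tensordot(A, B, axes=0)   # shape (2, 3, 4, 5), M[i, j, k, l] = A[i, j] * B[k, l]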

In earlier versions of tensorflow, axes=0 raises ValueError: 'axes' must be at least 1. Somehow tf.tensordot() used to need at least one dimension to actually sum over. The easy way out is to simply add a "fake" dimension with tf.expand_dims().

On tensorflow<=1.5 you can thus get the same result as above by doing:

M = tf.tensordot(tf.expand_dims(A, 0), tf.expand_dims(B, 0), axes=[[0],[0]])

This adds a new index of dimension 1 in location 0 for both tensors and then lets tf.tensordot() sum over those indices.

Howe answered 13/7, 2018 at 11:16 Comment(0)

In case someone else stumbles upon this, according to the tensorflow docs you can use the tf.einsum() function to compute the outer product of two tensors a and b:

# Outer product
outer = tf.einsum('i,j->ij', u, v)  # output[i, j] = u[i] * v[j]
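
For instance, with concrete (illustrative) vectors:

import tensorflow as tf

u = tf.constant([1., 2., 3.])
v = tf.constant([4., 5.])
outer = tf.einsum('i,j->ij', u, v)   # shape (3, 2), outer[i, j] = u[i] * v[j]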
Townscape answered 11/12, 2017 at 10:31 Comment(2)
Yes, but this is tedious, because the first argument to einsum() depends on the shape of the tensors you want to take the outer product of. – Howe
You can generalize this to larger tensors A and B of shape (..., x, d) via: tf.einsum('{0}xz,{0}xz->{0}xy'.format(string.ascii_lowercase[:A._rank() - 2]), A, B) – Pickar

tf.multiply (and its '*' shortcut) result in an outer product, whether or not a batch is used. In particular, if the two input tensors have 3D shapes of [batch, n, 1] and [batch, 1, n], then this op calculates the outer product of [n, 1] and [1, n] for each sample in the batch. If there is no batch, so that the two input tensors are 2D, this op calculates the outer product just the same. On the other hand, while tf.tensordot yields the outer product for 2D matrices, it does not broadcast similarly when a batch is added.

Without a batch:

import numpy as np
import tensorflow as tf  # TF1-style graph code (tf.placeholder / tf.Session)

a_np = np.array([[1, 2, 3]])       # shape: (1,3) [a row vector], 2D Tensor
b_np = np.array([[4], [5], [6]])   # shape: (3,1) [a column vector], 2D Tensor
a = tf.placeholder(dtype='float32', shape=[1, 3])
b = tf.placeholder(dtype='float32', shape=[3, 1])
c = a * b                             # Result: an outer-product of a, b
d = tf.multiply(a, b)                 # Result: an outer-product of a, b
e = tf.tensordot(a, b, axes=[0, 1])   # Result: an outer-product of a, b

With a batch:

a_np = np.array([[[1, 2, 3]], [[4, 5, 6]]]) # shape: (2,1,3) [a batch of two row vectors], 3D Tensor
b_np = np.array([[[7], [8], [9]], [[10], [11], [12]]]) # shape: (2,3,1) [a batch of two column vectors], 3D Tensor
a = tf.placeholder(dtype='float32', shape=[None, 1, 3])
b = tf.placeholder(dtype='float32', shape=[None, 3, 1])
c = a*b # Result: an outer-product per batch
d = tf.multiply(a,b) # Result: an outer-product per batch
e = tf.tensordot(a, b, axes=[1, 2]) # Does NOT result in an outer-product per batch

Running either of these two graphs:

sess = tf.Session()
result_asterisk = sess.run(c, feed_dict={a: a_np, b: b_np})
result_multiply = sess.run(d, feed_dict={a: a_np, b: b_np})
result_tensordot = sess.run(e, feed_dict={a: a_np, b: b_np})
print('a*b:')
print(result_asterisk)
print('tf.multiply(a,b):')
print(result_multiply)
print('tf.tensordot(a,b, axes=[1,2]):')
print(result_tensordot)
Flosi answered 1/8, 2018 at 19:57 Comment(0)

As pointed out in the other answers, the outer product can be done using broadcasting:

a = tf.range(10)
b = tf.range(5)
outer = a[..., None] * b[None, ...]

tf.InteractiveSession().run(outer)
# array([[ 0,  0,  0,  0,  0],
#        [ 0,  1,  2,  3,  4],
#        [ 0,  2,  4,  6,  8],
#        [ 0,  3,  6,  9, 12],
#        [ 0,  4,  8, 12, 16],
#        [ 0,  5, 10, 15, 20],
#        [ 0,  6, 12, 18, 24],
#        [ 0,  7, 14, 21, 28],
#        [ 0,  8, 16, 24, 32],
#        [ 0,  9, 18, 27, 36]], dtype=int32)

Explanation:

  • The a[..., None] inserts a new dimension of length 1 after the last axis.
  • Similarly, b[None, ...] inserts a new dimension of length 1 before the first axis.
  • The element-wise multiplication then broadcasts the tensors from shapes (10, 1) * (1, 5) to (10, 5) * (10, 5), computing the outer product.

Where you insert the additional dimensions determines for which dimensions the outer product is computed. For example, if both tensors have a batch dimension, you can skip it using :, which gives a[:, ..., None] * b[:, None, ...]. This can be further abbreviated as a[..., None] * b[:, None]. To perform the outer product over the last dimension and thus support any number of batch dimensions, use a[..., None] * b[..., None, :].
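
For instance, a minimal sketch of the last-dimension case with a batch (shapes here are just illustrative):

import tensorflow as tf

a = tf.ones([8, 10])                     # batch of 8 vectors of length 10
b = tf.ones([8, 5])                      # batch of 8 vectors of length 5
outer = a[..., None] * b[..., None, :]   # shape (8, 10, 5): one outer product per batch element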

Unitarian answered 6/8, 2018 at 11:25 Comment(1)
Actually the only solution that worked for me. Thank you for the clean explanation. – Mesmerism

I would have commented on MasDra's answer, but SO wouldn't let me as a new registered user.

The general outer product of multiple vectors arranged in a list U of length order can be obtained via

tf.einsum(','.join(string.ascii_lowercase[0:order])+'->'+string.ascii_lowercase[0:order], *U)
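
A minimal usage sketch of this, with three illustrative vectors (tf.einsum runs eagerly in TensorFlow 2):

import string
import tensorflow as tf

U = [tf.range(2, dtype=tf.float32),
     tf.range(3, dtype=tf.float32),
     tf.range(4, dtype=tf.float32)]
order = len(U)
spec = ','.join(string.ascii_lowercase[0:order]) + '->' + string.ascii_lowercase[0:order]  # 'a,b,c->abc'
outer = tf.einsum(spec, *U)   # shape (2, 3, 4)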
Outing answered 6/9, 2018 at 11:14 Comment(0)
