I am trying to implement an Expectation-Maximization (EM) algorithm using TensorFlow and TensorFlow Probability. It worked very well until I tried to add support for missing data (samples can contain NaN values in some random dimensions).
The problem is that with missing data I can no longer do everything as vectorized operations; I have to work with indexing and for-loops, like this:
# Here we iterate through all the data samples
for i in range(n):
    # x_i is the sample i
    x_i = tf.expand_dims(x[:, i], 1)
    gamma.append(estimate_gamma(x_i, pi, norm, ber))
    est_x_n_i = []
    est_xx_n_i = []
    est_x_b_i = []
    # Iterate over the k mixture components
    for j in range(k):
        mu_k = norm.mean()[j, :]
        sigma_k = norm.covariance()[j, :, :]
        rho_k = ber.mean()[j, :]
        est_x_n_i.append(estimate_x_norm(x_i[:d, :], mu_k, sigma_k))
        est_xx_n_i.append(estimate_xx_norm(x_i[:d, :], mu_k, sigma_k))
        est_x_b_i.append(estimate_x_ber(x_i[d:, :], rho_k))
    est_x_n.append(tf.convert_to_tensor(est_x_n_i))
    est_xx_n.append(tf.convert_to_tensor(est_xx_n_i))
    est_x_b.append(tf.convert_to_tensor(est_x_b_i))
What I found was that these operations are not very efficient. While the first samples took less than a second each, after about 50 samples it was taking around 3 seconds per sample. My guess is that this happens because I keep creating new tensors inside the session, and that is clogging up memory somehow.
I am quite new to TensorFlow, and since most people only use it for deep learning and neural networks, I couldn't find a solution to this.
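One way to check whether that is what is happening (a quick diagnostic sketch; these calls are not part of my original code) would be to count the ops in the default graph before and after a few loop iterations:

import tensorflow as tf

graph = tf.get_default_graph()
print(len(graph.get_operations()))  # op count before running the per-sample loop
# ... run a few iterations of the loop above ...
print(len(graph.get_operations()))  # if this number keeps growing, new ops are
                                    # being added to the graph on every iteration
# Alternatively, finalize the graph so any further op creation raises an error:
graph.finalize()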
Then I tried to implement the previous for-loop, and the functions called inside it, using only NumPy arrays and NumPy operations. But this returned the following error:
You must feed a value for placeholder tensor 'Placeholder_4' with dtype double and shape [8,18]
This error happens because, when the NumPy functions inside the loop try to execute, the placeholder has not been fed yet.
# Build one EM iteration and run it, feeding the data through the placeholder x
pi_k, mu_k, sigma_k, rho_k, gamma_ik, exp_loglik = exp_max_iter(x, pi, dist_norm, dist_ber)
pi, mu, sigma, rho, responsability, NLL[i + 1] = sess.run([pi_k, mu_k, sigma_k, rho_k, gamma_ik, exp_loglik], {x: samples})
Is there any way to solve this? Thanks.
tf.py_func? – Di
tf.py_func, here is one working example: github.com/Johswald/Bayesian-FlowNet/blob/master/flownet.py. In case you decide to migrate your code to pure tf, there should be methods in tf to help remove/ignore missing data - if you choose this route, search for solutions on SO and, if you aren't able to solve it, you can create a new question. – Di
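Following the tf.py_func suggestion above, a minimal sketch of how a NumPy-based step could be deferred until the placeholder is actually fed (np_e_step and the [8, 18] shape are illustrative placeholders, not the original code):

import numpy as np
import tensorflow as tf

def np_e_step(x_np):
    # Runs as ordinary NumPy code at sess.run time, so np.isnan / boolean
    # indexing can be used to handle the missing values. Placeholder logic only.
    return np.nan_to_num(x_np)

x = tf.placeholder(tf.float64, shape=[8, 18])
# tf.py_func wraps the NumPy function as a graph op; the function is only
# executed when the session runs, i.e. after {x: samples} has been fed.
est = tf.py_func(np_e_step, [x], tf.float64)

with tf.Session() as sess:
    result = sess.run(est, {x: np.random.randn(8, 18)})

For the pure-tf route the comment mentions, one assumption would be to combine tf.math.is_nan with tf.where to mask out the NaN entries while keeping the operations vectorized, rather than looping over samples.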