Is there a way to call a Numpy function inside a TensorFlow session?

I am trying to implement an Expectation Maximization algorithm using TensorFlow and TensorFlow Probability. It worked very well until I tried to handle missing data (samples can contain NaN values in some random dimensions).

The problem is that with missing data I can no longer express everything as vector operations; I have to work with indexing and for-loops, like this:

    # Here we iterate through all the data samples
    for i in range(n):
        # x_i is the sample i
        x_i = tf.expand_dims(x[:, i], 1)
        gamma.append(estimate_gamma(x_i, pi, norm, ber))
        est_x_n_i = []
        est_xx_n_i = []
        est_x_b_i = []
        for j in range(k):
            mu_k = norm.mean()[j, :]
            sigma_k = norm.covariance()[j, :, :]
            rho_k = ber.mean()[j, :]
            est_x_n_i.append(estimate_x_norm(x_i[:d, :], mu_k, sigma_k))
            est_xx_n_i.append(estimate_xx_norm(x_i[:d, :], mu_k, sigma_k))
            est_x_b_i.append(estimate_x_ber(x_i[d:, :], rho_k))
        est_x_n.append(tf.convert_to_tensor(est_x_n_i))
        est_xx_n.append(tf.convert_to_tensor(est_xx_n_i))
        est_x_b.append(tf.convert_to_tensor(est_x_b_i))

I found that these operations are not very efficient. The first samples took less than 1 second each, but after 50 samples each one took about 3 seconds. My guess is that this happens because I am creating new tensors inside the session on every iteration, and that affects memory in some way.
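For some of the per-sample terms, a boolean mask can keep the computation vectorized even when entries are missing. The sketch below is NumPy only and is not the code from the question. It assumes the Bernoulli dimensions are independent, so a missing entry can simply be left out of the sum, and it stores samples as rows (the question stores them as columns). The helper name `masked_bernoulli_loglik` and the demo arrays are hypothetical.

```python
import numpy as np

def masked_bernoulli_loglik(x, rho):
    """Log-likelihood of each sample under each Bernoulli component.

    x:   (n, d) array of 0/1 values, NaN where a value is missing.
    rho: (k, d) array of success probabilities, strictly between 0 and 1.
    Returns an (n, k) array that sums only over observed dimensions.
    """
    observed = ~np.isnan(x)                # (n, d) True where a value exists
    x_filled = np.where(observed, x, 0.0)  # filler value, masked out below
    xb = x_filled[:, None, :]              # (n, 1, d)
    rb = rho[None, :, :]                   # (1, k, d)
    log_p = xb * np.log(rb) + (1.0 - xb) * np.log1p(-rb)  # (n, k, d)
    return np.sum(log_p * observed[:, None, :], axis=2)

# Hypothetical demo data: the first sample is missing its second dimension.
x_demo = np.array([[1.0, np.nan],
                   [0.0, 1.0]])
rho_demo = np.array([[0.5, 0.5],
                     [0.9, 0.1]])
loglik = masked_bernoulli_loglik(x_demo, rho_demo)
print(loglik)
```

The same masking pattern carries over to tensor operations, since it only relies on element-wise comparison, selection, and a reduction.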

I am quite new to TensorFlow, and since most people use it only for deep learning and neural networks, I couldn't find a solution for this.

Then I tried to implement the for-loop above, and the functions called inside it, using only NumPy arrays and NumPy operations. But this raised the following error:

    You must feed a value for placeholder tensor 'Placeholder_4' with dtype double and shape [8,18]

This error happens because the NumPy functions inside the loop are executed before the placeholder has been fed.

    pi_k, mu_k, sigma_k, rho_k, gamma_ik, exp_loglik = exp_max_iter(x, pi, dist_norm, dist_ber)
    pi, mu, sigma, rho, responsability, NLL[i + 1] = sess.run([pi_k, mu_k, sigma_k, rho_k, gamma_ik, exp_loglik], {x: samples})

Is there any way to solve this? Thanks.

Repute answered 27/7, 2019 at 23:41 Comment(5)
I haven't really looked at your question in detail, but from the title it looks like you might want to check out tf.py_func? - Di
@teng It seems like it could be useful, I will try it. - Repute
@teng I don't think it is working correctly... Could you explain how to use it? The documentation and the examples are not very clear. - Repute
The tf doc is indeed not very clear. If you still decide to use tf.py_func, here is one working example: github.com/Johswald/Bayesian-FlowNet/blob/master/flownet.py . In case you decide to migrate your code to pure tf, there should be methods in tf to help remove/ignore missing data. If you choose this route, search for solutions on SO, and if you aren't able to solve it, you can create a new question. - Di
Also, the current scope of the question is a bit too broad, so it is not getting much attention. It would also help if you posted some minimal (non-)working code with sample input data so that people can tinker with the code. - Di

To answer your title question, "Is there a way to call a Numpy function inside a TensorFlow session?": below is some sample code that executes a "numpy function" (sklearn.mixture.GaussianMixture) on data with missing values, either by calling the function directly or via TensorFlow's py_function. I sense this may not be exactly what you are looking for. If you are just trying to implement EM, the existing implementation of the Gaussian Mixture Model in TensorFlow may be of some help:

documentation on tf.contrib.factorization.gmm: https://www.tensorflow.org/api_docs/python/tf/contrib/factorization/gmm

implementation: https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/contrib/factorization/python/ops/gmm_ops.py#L462-L506

Sample code to call a 'numpy function' directly and within a TensorFlow graph:

    import numpy as np
    np.set_printoptions(2)
    import tensorflow as tf
    from sklearn.mixture import GaussianMixture as GMM

    def myfunc(x, istf=True):
        # strip nans
        if istf:
            # inside tf.py_function, x arrives as an eager tensor
            mask = ~tf.is_nan(x)
            x = tf.boolean_mask(x, mask).numpy()
        else:
            ind = np.where(~np.isnan(x))
            x = x[ind]
        x = np.expand_dims(x, axis=-1)
        # fit a two-component mixture and return the two component means
        gmm = GMM(n_components=2)
        gmm.fit(x)
        m0, m1 = gmm.means_[:, 0]
        return np.array([m0, m1])

    # create data with nans
    np.random.seed(42)
    x = np.random.rand(5, 28, 1)
    c = 5
    x.ravel()[np.random.choice(x.size, c, replace=False)] = np.nan

    # directly call "numpy function"
    for ind in range(x.shape[0]):
        val = myfunc(x[ind, :], istf=False)
        print(val)
    # output:
    # [0.7  0.26]
    # [0.15 0.72]
    # [0.77 0.2 ]
    # [0.65 0.23]
    # [0.35 0.87]

    # initialization
    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()

    # create graph
    X = tf.placeholder(tf.float32, [28, 1])
    Y = tf.py_function(myfunc, [X], [tf.float32], name='myfunc')

    # call "numpy function" in tensorflow graph
    for ind in range(x.shape[0]):
        val = sess.run(Y, feed_dict={X: x[ind, :]})
        print(val)
    # output (the order of the two components can differ from the direct call):
    # [array([0.29, 0.76], dtype=float32)]
    # [array([0.72, 0.15], dtype=float32)]
    # [array([0.77, 0.2 ], dtype=float32)]
    # [array([0.23, 0.65], dtype=float32)]
    # [array([0.35, 0.87], dtype=float32)]
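
For the session-based setup in the question, the same idea can be reduced to a smaller sketch with tf.numpy_function, which runs the wrapped NumPy code only when sess.run feeds the placeholder. This assumes TensorFlow 1.14 or later (going through tf.compat.v1 so that it also runs on 2.x). The helper `nan_column_mean` and the sample values are hypothetical and not part of the question's code.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # needed on TF 2.x to use placeholders and sessions

def nan_column_mean(a):
    # Plain NumPy: this body runs only when the session executes the op,
    # so `a` is already a concrete array with the fed values.
    return np.nanmean(a, axis=0)

graph = tf.Graph()
with graph.as_default():
    x_in = tf1.placeholder(tf.float64, shape=[None, 3])
    # Wrap the NumPy function as a graph op; Tout must match what it returns.
    col_mean = tf.numpy_function(nan_column_mean, [x_in], tf.float64)

# Hypothetical samples with one missing value in each row.
samples = np.array([[1.0, np.nan, 3.0],
                    [3.0, 4.0, np.nan]])

with tf1.Session(graph=graph) as sess:
    result = sess.run(col_mean, feed_dict={x_in: samples})
print(result)
```

Note that the wrapped function is opaque to TensorFlow: no gradients flow through it, which is fine for EM updates computed in closed form.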
Di answered 28/7, 2019 at 16:37 Comment(0)

You can convert your NumPy function into a TensorFlow function, so that it should not cause problems when called inside a session. A simple example follows: define an IOU function in NumPy and then call it via tf.numpy_function:

    import numpy as np
    import tensorflow as tf

    def IOU(Pred, GT, NumClasses, ClassNames):
        ClassIOU = np.zeros(NumClasses)     # IOU per class
        ClassWeight = np.zeros(NumClasses)  # number of pixels per class in predicted U ground truth (union for this class)
        for i in range(NumClasses):  # go over all classes
            Intersection = np.float32(np.sum((Pred == GT) * (GT == i)))  # class intersection
            Union = np.sum(GT == i) + np.sum(Pred == i) - Intersection   # class union
            if Union > 0:
                ClassIOU[i] = Intersection / Union  # intersection over union
                ClassWeight[i] = Union

        # take the mean only over classes that are actually present in the GT;
        # cast to int so the class values can be used as indices
        present_classes = np.unique(GT).astype(int)
        mean_IOU = np.mean(ClassIOU[present_classes])
        # append the mean to the per-class results
        ClassNames = np.append(ClassNames, 'Mean')
        ClassIOU = np.append(ClassIOU, mean_IOU)
        ClassWeight = np.append(ClassWeight, np.sum(ClassWeight))

        return mean_IOU

    # and now call it as follows (y_pred and y_true are assumed to be
    # tensors of class labels defined elsewhere)
    NumClasses = 6
    ClassNames = ['Background', 'Class_1', 'Class_1',
                  'Class_1 ', 'Class_1', 'Class_1 ']
    x = tf.numpy_function(IOU, [y_pred, y_true, NumClasses, ClassNames],
                          tf.float64, name=None)
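
Before wrapping a function like this in tf.numpy_function, it can be checked on plain arrays. The sketch below is a compact, hypothetical variant (`mean_iou` is not the function above). It assumes integer class labels smaller than `num_classes`, and it counts the intersection as the pixels where both prediction and ground truth equal the class.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over the classes present in gt."""
    pred = np.asarray(pred).astype(np.int64)
    gt = np.asarray(gt).astype(np.int64)
    class_iou = np.zeros(num_classes)
    for c in range(num_classes):
        intersection = np.sum((pred == c) & (gt == c))
        union = np.sum(pred == c) + np.sum(gt == c) - intersection
        if union > 0:
            class_iou[c] = intersection / union
    present = np.unique(gt)  # integer labels, usable as indices
    return float(np.mean(class_iou[present]))

# Hypothetical four-pixel example with two classes.
pred_demo = np.array([0, 0, 1, 1])
gt_demo = np.array([0, 1, 1, 1])
score = mean_iou(pred_demo, gt_demo, num_classes=2)
print(score)
```

Returning a Python float (or a NumPy float64) keeps the result compatible with a tf.float64 output type if the function is later wrapped.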
Screeching answered 9/4, 2021 at 6:35 Comment(0)
