Parameters shooting to infinity after a few training epochs
I'm implementing linear regression in TensorFlow for the first time. Initially I tried a linear model, but after a few training iterations my parameters shot up to infinity. So I changed my model to a quadratic one and tried training again, but after a few epochs the same thing happens.

Hence, the variable passed to tf.summary.histogram('Weights', W0) contains inf, and the same is true for W1 and b.

I wanted to inspect my parameters in TensorBoard (because I've never worked with it before), but I'm getting the error below.

I asked this question previously; the only difference was that I was using a linear model, which had the same problem. (At the time I didn't know the cause was the parameters going to infinity, because I was running the code in my IPython notebook; when I ran the program in the terminal, the error below was generated, which helped me figure out that the parameters were shooting to infinity.) In the comments I learned that the code ran on someone else's PC, and his TensorBoard showed the parameters actually reaching infinity.
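
A quick check along these lines inside the training loop (just a sketch, reusing the variable names and imports from the code below) would confirm the divergence directly instead of waiting for the summary op to fail:

# Sketch: inside the epoch loop, stop as soon as any parameter stops being finite.
vals = sess.run([W0, W1, b])
if not np.all(np.isfinite(vals)):
    print("Parameters diverged at epoch", i, ":", vals)
    break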

Here is the link to the question I asked earlier. I hope I've declared Y_ correctly in my program; if not, please correct me!

Here is the code in Tensorflow:

import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.datasets import load_boston
import matplotlib.pyplot as plt

boston=load_boston()
type(boston)
boston.feature_names

bd=pd.DataFrame(data=boston.data,columns=boston.feature_names)

bd['Price']=pd.DataFrame(data=boston.target)
np.random.shuffle(bd.values)


W0=tf.Variable(0.3)
W1=tf.Variable(0.2)
b=tf.Variable(0.1)
#print(bd.shape[1])

tf.summary.histogram('Weights', W0)
tf.summary.histogram('Weights', W1)
tf.summary.histogram('Biases', b)



dataset_input=bd.iloc[:, 0 : bd.shape[1]-1];
#dataset_input.head(2)

dataset_output=bd.iloc[:, bd.shape[1]-1]
dataset_output=dataset_output.values
dataset_output=dataset_output.reshape((bd.shape[0],1)) 
#converted (506,) to (506,1) because in pandas
#the shape was not changing and it was needed later in feed_dict


dataset_input=dataset_input.values  #only dataset_input is in DataFrame form and converting it into np.ndarray


dataset_input = np.array(dataset_input, dtype=np.float32) 
#making the datatype into float32 for making it compatible with placeholders

dataset_output = np.array(dataset_output, dtype=np.float32)

X=tf.placeholder(tf.float32, shape=(None,bd.shape[1]-1))
Y=tf.placeholder(tf.float32, shape=(None,1))

Y_=W0*X*X + W1*X + b    #Hope this equation is rightly written
#Y_pred = tf.add(tf.multiply(tf.pow(X, pow_i), W), Y_pred)
print(X.shape)
print(Y.shape)


loss=tf.reduce_mean(tf.square(Y_-Y))
tf.summary.scalar('loss',loss)

optimizer=tf.train.GradientDescentOptimizer(0.001)
train=optimizer.minimize(loss)

init=tf.global_variables_initializer()#tf.global_variables_initializer()#tf.initialize_all_variables()
sess=tf.Session()
sess.run(init)



wb_=[]
with tf.Session() as sess:
    summary_merge = tf.summary.merge_all()

    writer=tf.summary.FileWriter("Users/ajay/Documents",sess.graph)

    epochs=10
    sess.run(init)

    for i in range(epochs):
        s_mer=sess.run(summary_merge,feed_dict={X: dataset_input, Y: dataset_output})  #ERROR________ERROR
        sess.run(train,feed_dict={X:dataset_input,Y:dataset_output})

        #CHANGED
        sess.run(loss, feed_dict={X:dataset_input,Y:dataset_output})
        writer.add_summary(s_mer,i)

        #tf.summary.histogram(name="loss",values=loss)
        if(i%5==0):
            print(i, sess.run([W0,W1,b]))
            wb_.append(sess.run([W0,W1,b]))

print(writer.get_logdir())
print(writer.close())

I'm getting this error:

 /anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
(?, 13)
(?, 1)
2018-07-22 02:04:24.826027: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
0 [-3833776.2, -7325.9595, -15.471448]
5 [inf, inf, inf]
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Infinity in summary histogram for: Biases
     [[Node: Biases = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Biases/tag, Variable_2/read)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "LR.py", line 75, in <module>
    s_mer=sess.run(summary_merge,feed_dict={X: dataset_input, Y: dataset_output})  #ERROR________ERROR
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Infinity in summary histogram for: Biases
     [[Node: Biases = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Biases/tag, Variable_2/read)]]

Caused by op 'Biases', defined at:
  File "LR.py", line 24, in <module>
    tf.summary.histogram('Biases', b)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/summary/summary.py", line 187, in histogram
    tag=tag, values=values, name=scope)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_logging_ops.py", line 283, in histogram_summary
    "HistogramSummary", tag=tag, values=values, name=name)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3414, in create_op
    op_def=op_def)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Infinity in summary histogram for: Biases
     [[Node: Biases = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Biases/tag, Variable_2/read)]]
Fortaleza answered 23/7, 2018 at 17:31

Comments:
I've reduced the gradient descent learning rate, but the problem still persists. – Fortaleza
The answer below is still not working! – Fortaleza

Answer:

I believe this is caused by too high a learning rate for gradient descent. Please refer to "Gradient descent explodes if learning rate is too large".

Here the loss is actually getting bigger after each epoch.
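
The scale of the gradient explains why. The gradient of the mean-squared loss with respect to W0 involves X*X, and raw Boston features such as TAX and B are in the hundreds, so the very first gradient is already enormous. A rough check (just a sketch, reusing the dataset_input and dataset_output arrays and the numpy import from the question's code) is:

# Evaluate the gradient of the question's loss at the initial values
# W0=0.3, W1=0.2, b=0.1 using plain NumPy (no TensorFlow needed).
W0_init, W1_init, b_init = 0.3, 0.2, 0.1
pred = W0_init * dataset_input ** 2 + W1_init * dataset_input + b_init   # shape (506, 13)
err = pred - dataset_output                                              # broadcasts against (506, 1)
grad_W0 = np.mean(2.0 * err * dataset_input ** 2)
print(grad_W0)
# grad_W0 comes out on the order of 10**9, so a single step of size 0.001
# moves W0 by millions -- which matches the "0 [-3833776.2, ...]" line in the
# question's output. With a step size of 1e-10 the update shrinks to a sane scale.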

I changed

optimizer=tf.train.GradientDescentOptimizer(0.001)

to

optimizer=tf.train.GradientDescentOptimizer(0.0000000001)

Then I printed the loss after each epoch by changing

sess.run(loss, feed_dict={X:dataset_input,Y:dataset_output})

to

print("loss",sess.run(loss, feed_dict={X:dataset_input,Y:dataset_output}))

in your code. With these changes the error was gone. The output was:

(?, 13)
(?, 1)
loss =  44061484.0
0 [-0.08337769, 0.19926739, 0.099998444]
loss =  3373030.2
loss =  258605.05
loss =  20211.799
loss =  1964.4918
loss =  567.7717
5 [-0.0001616638, 0.19942635, 0.099998794]
loss =  460.862
loss =  452.67877
loss =  452.05255
loss =  452.00452
Users/ajay/Documents
None
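
A related option (only a sketch, not the fix I applied above) is to standardize the inputs before training; with every feature at unit scale the gradients stay small and the original learning rate of 0.001 generally no longer diverges:

from sklearn.preprocessing import StandardScaler

# Rescale each column to zero mean and unit variance before feeding it in.
# This reuses the dataset_input array and numpy import from the question's code.
scaler = StandardScaler()
dataset_input = scaler.fit_transform(dataset_input).astype(np.float32)

Either way, the underlying issue is the same: the update step has to be small relative to the scale of X*X in the model.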
Obvert answered 15/8, 2018 at 8:01

Comments:
No, it is still giving me the same error, and I've already tried your solution before. – Fortaleza
That is strange. Please find the full code here: gist.github.com/SragAR/77c084a6461d5fb821293efa199aed1d – Obvert
I fixed the problem in #51873092 – Fortaleza
