In TensorFlow 2.2.0, model.history.history is empty after fitting with validation_data

Asked 10/6, 2020 at 9:9 Answered 20/8 at 3:34

Solved python pandas dataframe tensorflow keras

At first it was working fine. Then I tweaked a few of the parameters used to create the model, and after that,

print(model.history.history)

gives me an empty dictionary.

Here is my entire code, if it helps:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import mean_absolute_error

df = pd.read_csv('TF_2_Notebooks_and_Data/DATA/kc_house_data.csv')
# print(df.columns)
'''prints
Index(['id', 'date', 'price', 'bedrooms', 'bathrooms', 'sqft_living',
       'sqft_lot', 'floors', 'waterfront', 'view', 'condition', 'grade',
       'sqft_above', 'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode',
       'lat', 'long', 'sqft_living15', 'sqft_lot15'],
      dtype='object')'''
# if we want to see what data column has missing data point,
# print(df.isnull()) #will print 'True' if data is missing
'''
          id   date  price  bedrooms  ...    lat   long  sqft_living15  sqft_lot15
0      False  False  False     False  ...  False  False          False       False
1      False  False  False     False  ...  False  False          False       False
2      False  False  False     False  ...  False  False          False       False
3      False  False  False     False  ...  False  False          False       False
4      False  False  False     False  ...  False  False          False       False
...      ...    ...    ...       ...  ...    ...    ...            ...         ...
21592  False  False  False     False  ...  False  False          False       False
21593  False  False  False     False  ...  False  False          False       False
21594  False  False  False     False  ...  False  False          False       False
21595  False  False  False     False  ...  False  False          False       False
21596  False  False  False     False  ...  False  False          False       False
'''
# print(df.isnull().sum())
'''
id               0
date             0
price            0
bedrooms         0
bathrooms        0
sqft_living      0
sqft_lot         0
floors           0
waterfront       0
view             0
condition        0
grade            0
sqft_above       0
sqft_basement    0
yr_built         0
yr_renovated     0
zipcode          0
lat              0
long             0
sqft_living15    0
sqft_lot15       0
dtype: int64
'''

# describing the data set
# print(df.describe().transpose())

# let us see with histogram the prices of the houses
# sns.distplot(df['price'])

# counting bedrooms per house
# sns.countplot(df['bedrooms'])

# removing unwanted data
df = df.drop('id', axis=1)
# changing data style to yyyy-mm-dd
df['date'] = pd.to_datetime(df['date'])
# extracting year from date
df['year'] = df['date'].apply(lambda date: date.year)
df['month'] = df['date'].apply(lambda date: date.month)
# checking if prices are affected by year
# sns.scatterplot(x=df['price'],y=df['month'],hue=df['year'])
# or
# sns.boxplot('month','price',data=df)
# or
# print(df.groupby('month').mean()['price'].plot())

# removing date column
df = df.drop('date', axis=1)
# also drop zipcodes
df = df.drop('zipcode', axis=1)
# print(df['yr_renovated'].value_counts())

X = df.drop('price', axis=1).values
y = df['price'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# print(X_train.shape)
# prints (15117, 19)
model = Sequential()
model.add(Dense(19, activation='relu'))
model.add(Dense(19, activation='relu'))
model.add(Dense(19, activation='relu'))
model.add(Dense(19, activation='relu'))
model.add(Dense(1, activation=None))
model.compile(optimizer='adam', loss='mse')
# adding validation data does not affect the model's weights and biases; it only gives an idea of
# whether the model is over-fitting or under-fitting the data
# reducing the batch size makes the model take longer to train, but less over-fitting will occur
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          batch_size=128, epochs=4,verbose=2)

predictions = model.predict(X_test)
# checking whether we are over-fitting or not
print(f"model hist is : \n {model.history.history}")
losses = pd.DataFrame(model.history.history)
print(losses)
#losses.plot()
# NOTE: the loss and val_loss curves should track each other if the model is not over-fitting the data.
#plt.ylabel('losses')
#plt.xlabel('number of epochs')
off_by = mean_absolute_error(y_test, predictions)
print(f"the predictions are off by {off_by} dollars")
print(f"the mean of all the prices is {df['price'].mean()}")
plt.show()

Output:

Epoch 1/4
119/119 - 0s - loss: 430244003840.0000 - val_loss: 418937962496.0000
Epoch 2/4
119/119 - 0s - loss: 429396754432.0000 - val_loss: 415953223680.0000
Epoch 3/4
119/119 - 0s - loss: 417119928320.0000 - val_loss: 387559292928.0000
Epoch 4/4
119/119 - 0s - loss: 354640822272.0000 - val_loss: 283466629120.0000
model hist is : 
 {}
Empty DataFrame
Columns: []
Index: []
the predictions are off by 401518.14752604166 dollars
the mean of all the prices is 540296.5735055795

Process finished with exit code 0

I'm not sure where to go from here. The line:

print(f"model hist is : \n {model.history.history}")

prints:

model hist is : 
{}

Since I need to analyse the loss along with the validation loss, I can't get any further.

Colotomy answered 10/6, 2020 at 9:9 Comment(1)

Have you tried history = model.fit(...) and then accessing history.history? – Seaden 10/6, 2020 at 10:40

history = model.fit(...)
print(f"model hist is : \n {history.history}")

Seaden answered 10/6, 2020 at 12:17 Comment(0)

Mahmoud Youssef's answer should be marked as the correct one. Another approach is to use the output of the CSVLogger callback, or to write a custom one, e.g.:

import tensorflow as tf


class CSVLogger(tf.keras.callbacks.Callback):
    def __init__(self, dir_results, save_log_epoch, save_loss_batch, separator=','):
        super().__init__()
        self.separator = separator
        self.save_log_epoch = save_log_epoch
        self.save_loss_batch = save_loss_batch
        
        self.dir_results = dir_results

        self.loss_batch_keys = None
        self.loss_batch_file = None
        self.loss_batch_filename = ''

        self.log_keys = None
        self.log_file = None
        self.log_filename = ''

    def on_train_begin(self, logs=None):
        if self.save_log_epoch:
            # MLConsts.PATTERN_MODEL_LOG is a constant from the answerer's own project (not defined here)
            self.log_filename = self.dir_results + self.model.name + MLConsts.PATTERN_MODEL_LOG + '.log'
            self.log_file = open(self.log_filename, 'a')

        if self.save_loss_batch:
            self.loss_batch_filename = self.dir_results + self.model.name + '_ep0_lossbatch.log'

    def on_train_end(self, logs=None):
        # log_file is only opened when save_log_epoch is True
        if self.log_file is not None:
            self.log_file.close()

    def on_epoch_begin(self, epoch, logs=None):
        if self.save_loss_batch:
            self.loss_batch_filename = self.dir_results + self.model.name + '_ep' + str(epoch) + '_lossbatch.log'
            self.loss_batch_file = open(self.loss_batch_filename, 'a')

    def on_epoch_end(self, epoch, logs=None):
        if self.save_log_epoch:
            logs = logs or {}

            # write the header line once, on the first epoch
            if not self.log_keys:
                self.log_keys = logs.keys()
                self.log_file.write(self.separator.join(self.log_keys) + '\n')

            self.log_file.write(self.separator.join([str(value) for value in logs.values()]) + '\n')
            self.log_file.flush()

        if self.save_loss_batch:
            self.loss_batch_keys = None
            self.loss_batch_file.flush()
            self.loss_batch_file.close()

    def on_batch_end(self, batch, logs=None):
        # the first argument is the batch index, not the epoch
        if not self.save_loss_batch:
            return

        logs = logs or {}
        if not self.loss_batch_keys:
            self.loss_batch_keys = logs.keys()
            self.loss_batch_file.write(self.separator.join(self.loss_batch_keys) + '\n')

        self.loss_batch_file.write(self.separator.join([str(value) for value in logs.values()]) + '\n')

and then use the logged data according to your needs.
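As a rough sketch of reading such a log back in: the per-epoch file written by on_epoch_end above is a header line of metric names followed by one separator-joined row per epoch, so the standard csv module can rebuild a dict shaped like history.history. The function name read_epoch_log and the file name example_model.log are made up for illustration, and the sample rows reuse the first two epochs from the question's output. The sketch assumes the file holds a single training run (the callback opens it in append mode, so a second run would add another header row).

```python
import csv
import os
import tempfile


def read_epoch_log(path, separator=','):
    """Return {metric_name: [float, ...]} from a header-plus-rows log file."""
    columns = {}
    with open(path, newline='') as handle:
        for row in csv.DictReader(handle, delimiter=separator):
            for name, value in row.items():
                columns.setdefault(name, []).append(float(value))
    return columns


# Sample content in the layout the callback writes (values taken from the question's output).
sample = (
    "loss,val_loss\n"
    "430244003840.0,418937962496.0\n"
    "429396754432.0,415953223680.0\n"
)

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "example_model.log")  # hypothetical file name
    with open(log_path, "w") as handle:
        handle.write(sample)
    history = read_epoch_log(log_path)

print(history["val_loss"])
```

The resulting dict can be passed to pd.DataFrame(...) in the same way as history.history.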

Beeswax answered 15/12, 2020 at 22:28 Comment(0)

If you call model.fit() followed by model.evaluate() in the same execution or cell run, the history will be empty.

This will cause the history to be empty:

model.fit(x_train,y_train,batch_size=32,epochs=1,verbose=2,validation_data=(x_test,y_test))  
model.evaluate(x_test,y_test,batch_size=x_test.shape[0],verbose=0)

This is how to solve it:

model.fit(x_train,y_train,batch_size=32,epochs=1,verbose=2,validation_data=(x_test,y_test))
model.history.history
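One way to picture what seems to be happening (my reading of the behaviour reported in this thread for TF 2.2, not something taken from the Keras source): each call such as fit() or evaluate() attaches a fresh history object to the model, so the one filled in by fit() is no longer reachable through model.history afterwards. The classes below are toy stand-ins, not Keras code; ToyModel and ToyHistory are hypothetical names used only to mimic that pattern. In the question's script, model.predict(X_test) runs before model.history.history is read, which would fit the same picture.

```python
# Toy stand-ins, NOT Keras code: they only mimic the behaviour reported in this thread.
class ToyHistory:
    """Holds per-epoch metrics, like the dict exposed as history.history."""

    def __init__(self):
        self.history = {}

    def record(self, **metrics):
        for name, value in metrics.items():
            self.history.setdefault(name, []).append(value)


class ToyModel:
    """Hypothetical model whose fit() and evaluate() each attach a fresh history object."""

    def __init__(self):
        self.history = None

    def fit(self, epochs):
        self.history = ToyHistory()  # fresh object for this call
        for epoch in range(epochs):
            self.history.record(loss=1.0 / (2 ** epoch))
        return self.history  # the caller can keep this reference

    def evaluate(self):
        self.history = ToyHistory()  # replaces the attribute; nothing is recorded
        return 0.0


model = ToyModel()
kept = model.fit(epochs=3)
model.evaluate()

print(model.history.history)  # {}
print(kept.history)           # {'loss': [1.0, 0.5, 0.25]}
```

Under that assumption, either reading model.history.history right after fit(), or keeping the object returned by fit(), avoids the empty dictionary.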

Aglimmer answered 21/2, 2022 at 9:24 Comment(2)

Could this be caused also by model.predict() being called within a custom callback? – Inclinometer 21/2, 2022 at 16:19

@user6903745, it looks like it, because I have a similar issue. I'm calling model.predict() inside the on_train_end callback and it causes the history to be empty... Unfortunately, I do not know how to fix this. – Syndicalism 7/10, 2022 at 21:11

The following is a classification problem from the book Artificial Intelligence in Finance. The latter part shows how to plot the history using a DataFrame. I'm posting it here because the original book also used pd.DataFrame(model.history.history), and that came out empty too. After reading this post, I finally made it work. So if anyone wants to use a DataFrame to plot the history, this may help.

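import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

f = 5   # number of features
n = 10  # number of samples
np.random.seed(100)
# input data:
x = np.random.randint(0, 2, (n, f))
# input labels:
y = np.random.randint(0, 2, n)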
model = Sequential()
model.add(Dense(256, activation='relu', input_dim = f))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['acc'])
# model.history.history ends up empty here, but model.fit returns the History object, so keep it.
history = model.fit(x, y, epochs=50, verbose=False)
y_ = np.where(model.predict(x).flatten() > 0.5, 1, 0)

import pandas as pd
res = pd.DataFrame(history.history)
res.plot(figsize=(10,6));

Bort answered 20/8 at 3:34 Comment(0)
