What is the difference between the file extensions .h5, .hdf5 and .ckpt, and which one should I use?

from keras import Sequential
from keras_preprocessing.image import ImageDataGenerator
from keras.layers import *
from keras.callbacks import ModelCheckpoint
import numpy as np
import os

img_size = 500  # number of pixels for width and height

# Random seed
np.random.seed(12321)

training_path = os.getcwd() + "/cats and dogs images/train"
testing_path = os.getcwd() + "/cats and dogs images/test"

# Defines the model
model = Sequential([
    Conv2D(filters=64, kernel_size=(3, 3), activation="relu", padding="same",
           input_shape=(img_size, img_size, 3)),
    MaxPool2D(pool_size=(2, 2), strides=2),
    Conv2D(filters=64, kernel_size=(3, 3), activation="relu", padding="same"),
    MaxPool2D(pool_size=(2, 2), strides=2),
    Flatten(),
    Dense(32, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# Scales the pixel values to between 0 and 1
datagen = ImageDataGenerator(rescale=1.0 / 255.0)

# Prepares training data
# class_mode="binary" yields 0/1 labels, matching the single sigmoid output
training_dataset = datagen.flow_from_directory(directory=training_path,
                                               target_size=(img_size, img_size),
                                               classes=["cat", "dog"],
                                               class_mode="binary",
                                               batch_size=19)

# Prepares testing data
testing_dataset = datagen.flow_from_directory(directory=testing_path,
                                              target_size=(img_size, img_size),
                                              classes=["cat", "dog"],
                                              class_mode="binary",
                                              batch_size=19)

# Compiles the model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Checkpoint: saves after an epoch only when the training loss improves
checkpoint = ModelCheckpoint("trained_model.h5", monitor="loss", verbose=1,
                             save_best_only=True, mode="min")

# Fitting the model to the dataset (training the model)
model.fit(x=training_dataset, steps_per_epoch=658,
          validation_data=testing_dataset, validation_steps=658,
          epochs=10, callbacks=[checkpoint], verbose=1)

# Evaluate model on training dataset (evaluate returns [loss, accuracy])
_, acc = model.evaluate(training_dataset, steps=len(training_dataset), verbose=0)
print("Accuracy on training dataset:")
print('> %.3f' % (acc * 100.0))

# Evaluate model on testing dataset
_, acc = model.evaluate(testing_dataset, steps=len(testing_dataset), verbose=0)
print("Accuracy on testing dataset:")
print('> %.3f' % (acc * 100.0))

## Saving the model:
# model.save("trained model.h5")
# print("Saved model to disk")

What is the difference between the file extensions .h5, .hdf5 and .ckpt?

.h5 and .hdf5

According to this, .h5 and .hdf5 are basically the same thing: both are extensions for a data file saved in the Hierarchical Data Format (HDF), which contains multidimensional arrays of scientific data.

And according to this, saving a model in that format stores the following:

The weight values.
The model's architecture.
The model's training configuration (what you pass to the .compile() method)
The optimizer and its state, if any (this enables you to restart training where you left off)
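
As a minimal sketch of what that means in practice, the snippet below saves a small model to a .h5 file and loads it back without re-running the code that defined it. It assumes TensorFlow with its bundled Keras and h5py are installed; the tiny model, the random data, and the file name demo_model.h5 are made up for illustration and are not from the question.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Tiny hypothetical model, only for illustration.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Random stand-in data so the optimizer has some state to save.
x = np.random.rand(16, 4).astype("float32")
y = np.random.randint(0, 2, size=(16, 1)).astype("float32")
model.fit(x, y, epochs=1, verbose=0)

# The .h5 suffix selects the HDF5 format (.hdf5 is treated the same way).
path = os.path.join(tempfile.mkdtemp(), "demo_model.h5")
model.save(path)

# Loading needs only the file, not the Sequential([...]) definition above.
restored = keras.models.load_model(path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(
    model.predict(x, verbose=0), restored.predict(x, verbose=0), atol=1e-6
)
print(same_predictions)
```

Because the training configuration is stored too, the restored model should come back compiled, so calling restored.fit(...) can continue training from where the saved model stopped.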

.ckpt

It is short for checkpoint. As the name suggests, it saves the state of the model during training once a certain condition is met (for example, the loss drops below a certain value or the accuracy rises above one).
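
To make the "save only when a condition is met" idea concrete, here is a rough pure-Python sketch of a save-best-only rule like the one used in the question (save_best_only=True with mode='min'). The class BestOnlyCheckpoint and the loss values are hypothetical and are not Keras's actual implementation; a real callback would write a file where this one just records the epoch number.

```python
import math


class BestOnlyCheckpoint:
    """Toy stand-in for a 'save only when the monitored value improves' rule."""

    def __init__(self, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.mode = mode
        # Start from the worst possible value so the first epoch always saves.
        self.best = math.inf if mode == "min" else -math.inf
        self.saved_epochs = []

    def on_epoch_end(self, epoch, value):
        """Record a 'save' if value beats the best seen so far."""
        if self.mode == "min":
            improved = value < self.best
        else:
            improved = value > self.best
        if improved:
            self.best = value
            # A real checkpoint callback would write the model to disk here.
            self.saved_epochs.append(epoch)
        return improved


# Made-up per-epoch training losses.
losses = [0.69, 0.52, 0.55, 0.41, 0.43]

ckpt = BestOnlyCheckpoint(mode="min")
for epoch, loss in enumerate(losses, start=1):
    ckpt.on_epoch_end(epoch, loss)

print(ckpt.saved_epochs)  # epochs at which a checkpoint would be written
print(ckpt.best)          # best (lowest) loss seen
```

With these losses the rule saves at epochs 1, 2 and 4 and skips epochs 3 and 5, where the loss got worse, so the file on disk always holds the best state seen so far rather than the last one.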

The drawback of saving a model as .ckpt is that it stores only the weights (the values of the variables), not the architecture. To use it, you need the code that defines the full architecture and any functions it relies on, so that you can rebuild the model and load the saved weights into it.

This format is mainly used when you want to resume training. It also lets you customize which checkpoints are saved and loaded, so you can keep improving the model, change parameters according to the results, and create different models from different checkpoints.

Which extension should I use?

It depends on your goal. If you are still training and experimenting a lot, I would suggest saving the model in the .ckpt format.

If you are done experimenting and are finalizing the model, I would suggest saving it in the .h5 format, so that you can load and use it without needing the code that created the model architecture.

Also, would I need to call model.save(filepath) at the end of the code, or would my model be saved automatically by ModelCheckpoint()?

You can use both, but I would suggest giving the ModelCheckpoint() file a .ckpt extension so that the best model state reached during training is saved. When training is done, call model.save(filepath) with a .h5 extension, so that the final model can be loaded and used anywhere without the original architecture code.

That way you keep the option to continue training by loading the .ckpt checkpoint or, if you are satisfied with the final result, to use the .h5 file as the final version of the model.
