I have created a working CNN model in Keras/TensorFlow and have successfully used the CIFAR-10 and MNIST datasets to test it. The working code is shown below:
import keras
from keras.datasets import cifar10
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Conv2D, Flatten, MaxPooling2D, BatchNormalization
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
#reshape data to fit model
X_train = X_train.reshape(50000,32,32,3)
X_test = X_test.reshape(10000,32,32,3)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
# Building the model
model = Sequential()
#1st Convolutional Layer
model.add(Conv2D(filters=64, input_shape=(32,32,3), kernel_size=(11,11), strides=(4,4), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
#2nd Convolutional Layer
model.add(Conv2D(filters=224, kernel_size=(5, 5), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
#3rd Convolutional Layer
model.add(Conv2D(filters=288, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
#4th Convolutional Layer
model.add(Conv2D(filters=288, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
#5th Convolutional Layer
model.add(Conv2D(filters=160, kernel_size=(3,3), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))
model.add(Flatten())
# 1st Fully Connected Layer
model.add(Dense(4096))
model.add(BatchNormalization())
model.add(Activation('relu'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))
#2nd Fully Connected Layer
model.add(Dense(4096))
model.add(BatchNormalization())
model.add(Activation('relu'))
#Add Dropout
model.add(Dropout(0.4))
#3rd Fully Connected Layer
model.add(Dense(1000))
model.add(BatchNormalization())
model.add(Activation('relu'))
#Add Dropout
model.add(Dropout(0.4))
#Output Layer
model.add(Dense(10))
model.add(BatchNormalization())
model.add(Activation('softmax'))
#compile model using accuracy to measure model performance
opt = keras.optimizers.Adam(learning_rate = 0.0001)
model.compile(optimizer=opt, loss='categorical_crossentropy',
metrics=['accuracy'])
#train the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=30)
After utilising the aforementioned datasets, I wanted to go one step further and use a dataset with more channels than greyscale or RGB provide, hence the inclusion of a hyperspectral dataset. When looking for a hyperspectral dataset I came across this one.
The issue at this stage was realising that this hyperspectral dataset is a single image, with each value in the ground truth corresponding to one pixel. I therefore reformatted the data into a collection of hyperspectral pixels.
Code reformatting the corrected dataset into x_train & x_test:
import keras
import scipy
import numpy as np
import matplotlib.pyplot as plt
from keras.utils import to_categorical
from scipy import io
mydict = scipy.io.loadmat('Indian_pines_corrected.mat')
dataset = np.array(mydict.get('indian_pines_corrected'))
#This is creating the split between x_train and x_test from the original dataset
# x_train after this code runs will have a shape of (121, 145, 200)
# x_test after this code runs will have a shape of (24, 145, 200)
x_train = np.zeros((121,145,200), dtype=int)
x_test = np.zeros((24,145,200), dtype=int)
xtemp = np.array_split(dataset, [121])
x_train = np.array(xtemp[0])
x_test = np.array(xtemp[1])
# x_train will have a shape of (17545, 200)
# x_test will have a shape of (3480, 200)
x_train = x_train.reshape(-1, x_train.shape[-1])
x_test = x_test.reshape(-1, x_test.shape[-1])
Code reformatting the ground-truth dataset into Y_train & Y_test:
truthDataset = scipy.io.loadmat('Indian_pines_gt.mat')
gTruth = truthDataset.get('indian_pines_gt')
#This is creating the split between Y_train and Y_test from the original dataset
# Y_train after this code runs will have a shape of (121, 145)
# Y_test after this code runs will have a shape of (24, 145)
Y_train = np.zeros((121,145), dtype=int)
Y_test = np.zeros((24,145), dtype=int)
ytemp = np.array_split(gTruth, [121])
Y_train = np.array(ytemp[0])
Y_test = np.array(ytemp[1])
# Y_train will have a shape of (17545,)
# Y_test will have a shape of (3480,)
Y_train = Y_train.reshape(-1)
Y_test = Y_test.reshape(-1)
#17 classes, with labels ranging from 0-16
#Y_train one-hot encode target column
Y_train = to_categorical(Y_train, num_classes = 17)
#Y_test one-hot encode target column
Y_test = to_categorical(Y_test, num_classes = 17)
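As a quick sanity check (just a verification sketch using the variables defined above), the number of pixel samples and ground-truth labels should line up:
#Each hyperspectral pixel should have exactly one ground-truth label
print(x_train.shape, Y_train.shape)   # expected: (17545, 200) (17545, 17)
print(x_test.shape, Y_test.shape)     # expected: (3480, 200) (3480, 17)
assert x_train.shape[0] == Y_train.shape[0]
assert x_test.shape[0] == Y_test.shape[0]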
My thought process was that, despite the initial image being broken down into 1x1 patches, the large number of channels each patch possesses, with their respective values, would aid in the categorisation of the dataset.
Essentially I'd want to feed this reformatted data into my model (seen in the first code fragment of this post), but I'm uncertain whether I'm taking the wrong approach, due to my inexperience in this area. I was expecting to input a shape of (1,1,200), i.e. the shapes of x_train & x_test would be (17545,1,1,200) & (3480,1,1,200) respectively; a rough sketch of what I mean is below.
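This is only a sketch of the reshape I had in mind; it reuses the imports and variables from the fragments above, and the adjusted input_shape, 1x1 kernel and dropped pooling are my own assumptions (with 1x1 spatial patches there is nothing left to convolve or pool over):
# Reshape each 200-band pixel into a 1x1 "image" with 200 channels
x_train = x_train.reshape(-1, 1, 1, 200)   # (17545, 1, 1, 200)
x_test = x_test.reshape(-1, 1, 1, 200)     # (3480, 1, 1, 200)
# The first layer would then need input_shape=(1,1,200); kernel sizes and strides
# would have to shrink to (1,1) because the spatial size is only 1x1
model = Sequential()
model.add(Conv2D(filters=64, input_shape=(1,1,200), kernel_size=(1,1), strides=(1,1), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
# ...remaining convolutional blocks adjusted the same way (no pooling), then
# Flatten() and the Dense layers as in the original model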