X_train = X_train/255.0 model = Sequential() model.add(Conv2D(64, (3, 3), input_shape = X_train.shape[1:])) model.add(Activation("relu")) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Conv2D(128, (3, 3))) model.add(Activation("relu")) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Flatten()) model.add(Dense(64)) model.add(Dense(16)) # added 16 because it model.fit gave error on 15 model.add(Activation('softmax'))

Disclaimer: it's been a few years since I've played with CNNs myself, so I can only pass on some general advice and suggestions.

First of all, I would like to talk about the results you've gotten so far. The first two networks you've trained seem to at least learn something from the training data because they perform better than just randomly guessing.

However: the performance on the test data indicates that the network has not learned anything meaningful because those numbers suggest the network is as good as (or only marginally better than) a random guess.

As for the third network: high accuracy for training data combined with low accuracy for testing data means that your network has overfitted. This means that the network has memorized the training data but has not learned any meaningful patterns.

There's no point in continuing to train a network that has started overfitting. So once the training accuracy increases and testing accuracy decreases for a few epochs consecutively, you can stop training.

Increase the dataset size

Neural networks rely on loads of good training data to learn patterns from. Your dataset contains 15 classes with 15 images each, that is very little training data.

Of course, it would be great if you could get hold of additional high-quality training data to expand your dataset, but that is not always feasible. So a different approach is to artificially expand your dataset. You can easily do this by applying a bunch of transformations to the original training data. Think about: mirroring, rotating, zooming, and cropping.

Remember to not just apply these transformations willy-nilly, they must make sense! For example, if you want a network to recognize a chair, do you also want it to recognize chairs that are upside down? Or for detecting road signs: mirroring them makes no sense because the text, numbers, and graphics will never appear mirrored in real life.

From the brief description of the classes you have (planes and chairs and whatnot...), I think mirroring horizontally could be the best transformation to apply initially. That will already double your training dataset size.

Also, keep in mind that an artificially inflated dataset is never as good as one of the same size that contains all authentic, real images. A mirrored image contains much of the same information as its original, we merely hope it will delay the network from overfitting and hope that it will learn the important patterns instead.

Lower the learning rate

This is a bit of side note, but try lowering the learning rate. Your network seems to overfit in only a few epochs which is very fast. Obviously, lowering the learning rate will not combat overfitting but it will happen more slowly. This means that you can hopefully find an epoch with better overall performance before overfitting takes place.

Note that a lower learning rate will never magically make a bad-performing network good. It's just one way to locate a set of parameters that performs a tad bit better.

Randomize the training data order

During training, the training data is presented in batches to the network. This often happens in a fixed order over all iterations. This may lead to certain biases in the network.

First of all, make sure that the training data is shuffled at least once. You do not want to present the classes one by one, for example first all plane images, then all chairs, etc... This could lead to the network unlearning much of the first class by the end of each epoch.

Also, reshuffle the training data between epochs. This will again avoid potential minor biases because of training data order.

Improve the network design

You've designed a convolutional neural network with only two convolution layers and two fully connected layers. Maybe this model is too shallow to learn to differentiate between the different classes.

Know that the convolution layers tend to first pick up small visual features and then tend to combine these in higher level patterns. So maybe adding a third convolution layer may help the network identify more meaningful patterns.

Obviously, network design is something you'll have to experiment with and making networks overly deep or complex is also a pitfall to watch out for!

Increase the dataset size

Lower the learning rate

Randomize the training data order

Improve the network design

Recommended topics

Hot tags