Keras: What if the size of data is not divisible by batch_size?

I am new to Keras and have just started working through some examples. I am dealing with the following problem: I have 4032 samples, use about 650 of them for the fit (i.e. the training stage), and use the rest for testing the model. The problem is that I keep getting the following error:

Exception: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size.

I understand why I am getting this error; my question is, what if the size of my data is not divisible by batch_size? I used to work with the Deeplearning4j LSTM and did not have to deal with this problem. Is there any way to get around it?

Thanks

Demitria answered 22/6, 2016 at 17:5 Comment(10)
As far as getting around it is concerned, change the batch size. If the number of samples is a prime number, drop 1 or 2 examples. As for why this error occurs in Keras and not in Deeplearning4j, I am not sure.Issuant
Thanks for the suggestion, but I was hoping to get results without having to drop any samples.Demitria
You don't have to drop samples. 650 is not a prime number. If your total number of samples is a prime number, then it won't matter what batch size you choose, it will not be divisible. In your case, you can choose a batch size of 5, 10, 65, etc. Is that a real issue for you? In my experience, changing the batch size within reasonable limits won't affect performance much.Issuant
Sometimes the input size may be a prime number, in which case I have to choose a different batch size.Demitria
Also, this is a requirement only in stateful networks in Keras. I worked with Keras extensively for implementing CNNs. I didn't have any such requirement then.Issuant
So does that mean that if I switch to stateful=False, this would no longer be an issue? By the way, if stateful is False, is the model still an LSTM? I am using the network from one of the examples (stateful_lstm.py). Sorry if my questions are simple, but I am a newbie :) ThanksDemitria
No. Don't make any changes to the network architecture. In my opinion, you are overthinking this issue. If you have 650 training samples, make the batch size 50, 65, etc. Otherwise, drop one or two samples to make the count divisible by the batch size (for example, with 743 samples, which is prime, no batch size will work, so drop one sample to make it 742, which is divisible). Neural network performance won't be affected by one or ten samples more or less. If you have a dataset where removing 10 samples means removing 10% of the data, maybe you should consider some method other than neural networks.Issuant
The thing is that I am dealing with 50 datasets, each with a different size, and I must use a certain number of samples for testing (due to some benchmark restrictions). For now, I'll stick to a batch size of 64 and try to make the number of samples divisible by that (see the sketch after these comments). Also, do you have any useful references so I can read more about stateful networks? Once again, thank you so much.Demitria
Curious: Did you stop using DL4J? If so, why?Branle
@tremstat No, I just wanted to compare the results of bothDemitria
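A minimal sketch of making the sample count divisible by a fixed batch size, the approach the asker settles on in the comments above. The array names, shapes, sample count, and batch size below are illustrative assumptions, not taken from the thread:

import numpy as np

batch_size = 64  # illustrative choice

# Stand-in arrays; shapes are (samples, timesteps, features) and (samples, 1).
x_train = np.random.rand(743, 10, 3)
y_train = np.random.rand(743, 1)

# Keep only as many samples as fill complete batches, so a stateful
# network never receives a partial batch.
usable = (len(x_train) // batch_size) * batch_size
x_train, y_train = x_train[:usable], y_train[:usable]

print(len(x_train))  # 704: the trailing 39 samples are dropped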

The simplest solution is to use fit_generator instead of fit. I wrote a simple data-loader class that can be inherited to do more complex things. It would look something like the following, with get_next_batch_data redefined to fetch whatever your data is, including things like augmentation, etc.:

class BatchedLoader:
    def __init__(self, num_samples, batch_size):
        # e.g. num_samples = 33
        self.possible_indices = list(range(num_samples))
        self.batch_size = batch_size
        self.cur_it = 0
        self.cur_epoch = 0

    def get_batch_indices(self):
        batch_indices = self.possible_indices[self.cur_it:self.cur_it + self.batch_size]
        self.cur_it += self.batch_size
        if len(batch_indices) < self.batch_size:
            # Reached the end: reset the iterator, increase cur_epoch,
            # and shuffle possible_indices here if wanted.
            self.cur_it = 0
            self.cur_epoch += 1
            # Add the remaining K = batch_size - len(batch_indices) indices
            # from the start so every batch has exactly batch_size elements.
            k = self.batch_size - len(batch_indices)
            batch_indices += self.possible_indices[:k]
        return batch_indices

    def get_next_batch_data(self):
        batch_indices = self.get_batch_indices()
        # The data points corresponding to these indices are your next batch;
        # override this method to load (and augment, etc.) the actual data.
        raise NotImplementedError
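A rough sketch of how such a loader could be plugged into fit_generator; the generator wrapper, array shapes, and the commented-out model call below are assumptions for illustration, not part of the original answer:

import numpy as np

def batch_generator(loader, x, y):
    # fit_generator expects a generator that yields (inputs, targets) forever.
    while True:
        batch_indices = loader.get_batch_indices()
        yield x[batch_indices], y[batch_indices]

# Hypothetical data; shapes are (samples, timesteps, features) and (samples, 1).
x_train = np.random.rand(650, 20, 3)
y_train = np.random.rand(650, 1)
loader = BatchedLoader(num_samples=len(x_train), batch_size=64)

# model.fit_generator(batch_generator(loader, x_train, y_train),
#                     steps_per_epoch=int(np.ceil(len(x_train) / 64)),
#                     epochs=10)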
Huntington answered 18/4, 2018 at 16:59 Comment(2)
Does it mean that you repeat the training with some of the samples from the beginning of the data?Stabile
Yeah, either that, or you can just have a smaller final batch. The vast majority of ops are agnostic to the first dimension (i.e., batch size) anyway.Huntington
