Does batch size affect the accuracy of a model?
I have trained ResNet18 as the backbone on the CIFAR-100 dataset with the proposed technique for a research purpose, and I ended up with some surprising results. I made two attempts: the first with a batch size of 640 and the second with a batch size of 320. All other hyperparameters were kept the same.

The accuracy I got with batch size 640 is 76.45%. The accuracy I got with batch size 320 is 78.64%.

Can you tell me why this is happening?

In my opinion, this is just because of covariate shift. The distribution of the samples in each mini-batch over an epoch can affect the accuracy. I think the mini-batch distributions at batch size 320 are more similar to each other than at batch size 640, and that this leads to the higher accuracy.

Can you explain this, and what could be a solution for it?

Bordie answered 21/2, 2022 at 18:44 Comment(0)

It is much simpler than that. Batch size has a direct relation to the variance of your gradient estimator: a bigger batch means lower variance. Increasing your batch size is, optimization-wise, approximately equivalent to decreasing your learning rate.

For a more in-depth analysis, including theoretical arguments, refer to https://proceedings.neurips.cc/paper/2019/file/dc6a70712a252123c40d2adba6a11d84-Paper.pdf
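As a rough illustration (not your exact setup), here is a minimal PyTorch sketch on a toy linear-regression problem: it estimates the variance of the mini-batch gradient at batch sizes 320 and 640, and then applies the commonly cited linear scaling rule (scale the learning rate proportionally to the batch size) as one way to keep the two runs roughly comparable. The toy model, data, and base learning rate are made-up assumptions for illustration.

```python
# Hypothetical sketch: gradient-estimator variance vs. batch size on a toy
# linear-regression problem, plus a linear-scaling-rule learning-rate adjustment.
import torch

torch.manual_seed(0)

# Toy data: 2048 samples, 10 features, noisy linear targets (stand-in for a real dataset).
X = torch.randn(2048, 10)
true_w = torch.randn(10, 1)
y = X @ true_w + 0.1 * torch.randn(2048, 1)

w = torch.zeros(10, 1, requires_grad=True)

def minibatch_grad(batch_size):
    """One stochastic-gradient sample of the MSE loss w.r.t. w."""
    idx = torch.randint(0, X.shape[0], (batch_size,))
    loss = ((X[idx] @ w - y[idx]) ** 2).mean()
    (grad,) = torch.autograd.grad(loss, w)
    return grad.flatten()

def grad_variance(batch_size, trials=200):
    """Average per-coordinate variance of the mini-batch gradient estimator."""
    grads = torch.stack([minibatch_grad(batch_size) for _ in range(trials)])
    return grads.var(dim=0).mean().item()

for bs in (320, 640):
    print(f"batch size {bs:4d}: gradient-estimator variance ~ {grad_variance(bs):.5f}")
# The 640 batch gives roughly half the variance of the 320 batch,
# i.e. smoother but effectively "smaller" steps for the same learning rate.

# Linear scaling rule (an approximation, not a guarantee): when you double the
# batch size, double the learning rate to keep the optimization behaviour comparable.
base_lr, base_bs = 0.1, 320   # assumed reference setting
for bs in (320, 640):
    print(f"batch size {bs:4d}: suggested lr = {base_lr * bs / base_bs:.3f}")
```

In practice you could also just keep the smaller batch size if it generalizes better; the scaling rule is only a heuristic and tends to break down at very large batch sizes.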

Stpeter answered 21/2, 2022 at 21:9 Comment(0)
