I understand class imbalance in an image classification problem such as cat vs. dog classification: there are too many cat images and too few dog images. But I don't know how to address imbalance in a segmentation problem.
For example, my task is to mask cloud cover in satellite images, so I treat it as a two-class segmentation problem: one class is cloud, the other is background. The dataset has 5800 4-band, 16-bit images of size 256×256. The architecture is SegNet and the loss function is binary cross-entropy.
Consider two hypothetical cases:
- Half of all samples are fully covered by clouds, and half have no cloud at all.
- In every image, half the area is covered by cloud and half is not.
So case 2 is balanced, I guess, but what about case 1?
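To make the two cases concrete, here is a minimal sketch with toy masks (the sizes and the `cloud_fractions` helper are made up for illustration, not taken from my real data). It suggests that both cases have the same overall share of cloud pixels and differ only in how that share is spread across images:

```python
import numpy as np

def cloud_fractions(masks):
    """Per-image cloud fraction for a stack of binary masks shaped (N, H, W)."""
    masks = np.asarray(masks, dtype=bool)
    return masks.reshape(masks.shape[0], -1).mean(axis=1)

n, h, w = 4, 8, 8  # toy sizes, assumed only for this example

# Case 1: half of the images fully cloudy, half cloud-free
case1 = np.zeros((n, h, w), dtype=bool)
case1[: n // 2] = True

# Case 2: every image has its left half covered by cloud
case2 = np.zeros((n, h, w), dtype=bool)
case2[:, :, : w // 2] = True

frac1 = cloud_fractions(case1)
frac2 = cloud_fractions(case2)

print("case 1 per-image:", frac1, "overall:", frac1.mean())
print("case 2 per-image:", frac2, "overall:", frac2.mean())
```

Since binary cross-entropy is usually averaged over pixels, the overall pixel share is probably the number that matters most for the loss, but I am not sure whether the per-image distribution matters too.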
In reality, and in my task, neither case occurs in the source satellite images, since the cloud cover is always relatively small compared with the background. But because the source images are large, the samples are cropped from them, and some new cases emerge.
So, the samples always contain three types of images:
- fully covered by clouds (254 of 5800 samples).
- without any cloud (1241 of 5800 samples).
- partly covered by cloud (4305 of 5800 samples; I don't know the cloud percentage, which may be very high in some samples and very low in others).
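The unknown cloud percentage could be measured directly from the label masks. Below is a minimal sketch, assuming the masks can be loaded as a binary NumPy array of shape (N, H, W); the function name `summarize_masks` and the random stand-in data are hypothetical and only there to make the example runnable:

```python
import numpy as np

def summarize_masks(masks):
    """Summarize pixel-level class balance for binary masks shaped (N, H, W)."""
    masks = np.asarray(masks).astype(bool)
    # Fraction of cloud pixels in each image
    per_image = masks.reshape(masks.shape[0], -1).mean(axis=1)
    mixed = (per_image > 0) & (per_image < 1)
    # Distribution of cloud fraction among partly cloudy images, in 10 bins
    hist, _ = np.histogram(per_image[mixed], bins=10, range=(0.0, 1.0))
    return {
        "fully_cloudy": int((per_image == 1).sum()),
        "cloud_free": int((per_image == 0).sum()),
        "mixed": int(mixed.sum()),
        "overall_cloud_pixel_fraction": float(masks.mean()),
        "mixed_fraction_histogram": hist.tolist(),
    }

# Random stand-in for real masks (assumption: roughly 20% cloud pixels)
rng = np.random.default_rng(0)
demo = rng.random((6, 16, 16)) < 0.2
demo[0] = True   # one fully cloudy image
demo[1] = False  # one cloud-free image

stats = summarize_masks(demo)
for key, value in stats.items():
    print(key, value)
```

Run on the real 5800 masks, the `overall_cloud_pixel_fraction` value would show how far the cloud/background pixel ratio is from 50/50, and the histogram would show how the 4305 partly cloudy samples are distributed.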
My question:
Are the samples imbalanced, and if so, what should I do?
Thanks in advance.