I am trying to train a U-Net model on the Cityscapes dataset, which has 20 'useful' semantic classes and a number of background classes that can be ignored (e.g. sky, ego vehicle, mountains, street lights). To train the model to ignore these background pixels, I am using the following popular solution from the internet:
- Assign a common `ignore_label` (e.g. `ignore_label = 255`) to all the pixels belonging to the ignore classes.
- Train the model using the `cross_entropy` loss for each pixel prediction.
- Pass the `ignore_label` parameter to the `cross_entropy` loss, so that the computed loss ignores the pixels with the unnecessary classes.
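The steps above can be sketched in PyTorch, where the corresponding parameter of `F.cross_entropy` is called `ignore_index` (the tensor shapes and class count here are toy values, not the real Cityscapes ones):

```python
import torch
import torch.nn.functional as F

# Toy setup: 4 classes; 255 is the common ignore label (as in Cityscapes).
IGNORE_LABEL = 255

logits = torch.randn(2, 4, 8, 8)          # (batch, num_classes, H, W) model output
target = torch.randint(0, 4, (2, 8, 8))   # ground-truth class index per pixel
target[:, :2, :] = IGNORE_LABEL           # pretend the top rows are background

# Pixels labelled IGNORE_LABEL contribute nothing to the loss or its gradient.
loss = F.cross_entropy(logits, target, ignore_index=IGNORE_LABEL)
print(loss.item())
```

The mean is taken only over the non-ignored pixels, which is exactly why the model receives no training signal at all on the background regions.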
But this approach has a problem: once trained, the model ends up classifying these background pixels as belonging to one of the 20 classes. This is expected, since the loss never penalizes the model for whatever it predicts on the background pixels.
The second obvious solution is therefore to use an extra class for all the background pixels, making it the 21st class in Cityscapes. However, I am worried that this will 'waste' my model's capacity by teaching it to classify this additional, unnecessary class.
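In case it helps frame the question, this second option amounts to a small label-remapping step before computing the loss (the names `NUM_CLASSES`, `BACKGROUND_CLASS`, and `remap_background` are mine, just for illustration):

```python
import torch

NUM_CLASSES = 20                 # the 'useful' Cityscapes classes
IGNORE_LABEL = 255               # common label for all ignore classes
BACKGROUND_CLASS = NUM_CLASSES   # class index 20, i.e. the 21st class

def remap_background(target: torch.Tensor) -> torch.Tensor:
    # Fold every ignored pixel into one explicit background class, so the
    # model's output layer now needs NUM_CLASSES + 1 channels and the loss
    # is computed over all pixels (no ignore_index needed).
    target = target.clone()
    target[target == IGNORE_LABEL] = BACKGROUND_CLASS
    return target

t = torch.tensor([[0, 5, 255],
                  [255, 19, 3]])
print(remap_background(t))
```

With this remapping the model is explicitly trained to output "background" on those pixels, which is what avoids the misclassification problem above, at the cost of spending capacity on the extra class.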
What is the most accurate way of handling the background pixel classes?